
Automating Split DNS
Photo Credits: Unsplash
The What and Why of Split DNS
DNS can be considered 'split-brain' when there are two versions of a single zone (e.g. nicholas-morris.com), such that zone records resolve differently for different groups of clients. Typically, this takes the form of intranet users retrieving zone records from a local DNS server (like a Windows Server, or the network edge device), while external users retrieve the same zone from an external DNS provider, like Google or Cloudflare. The zone records in the external provider will generally be restricted to public-facing resources and will (of course) point to the public IPs of those resources, while the internal DNS server's records will typically point to the private IPs.

Why go to the hassle? Internal resources.
One of the most common use cases is Active Directory, which requires DNS to function. Internal hosts need to resolve Domain Controllers through DNS, but you don't want external hosts retrieving those records. Even if the records only provide private IPs, they still reveal more about your internal environment than you ought to be comfortable with.
As an alternative, you could use two entirely separate domains (e.g. [company].com and internal-[company].com), but that's no less complex from a configuration and management perspective - plus it seems a great deal less elegant. A common pattern is to use a subdomain for internal resources (e.g. internal.company.com). This is still considered split DNS - as 'internal.company.com' is a child of the 'company.com' zone - but it greatly simplifies management and configuration, and would completely preempt the "solution" I am about to present.
The Issue with Split DNS
If, like me, you have one or more environments with split DNS that do not leverage the subdomain pattern, you may encounter a particular problem with external-facing resources.
Consider one of my lab environments with the following components:
- An AD domain on the parent zone (not a subdomain as described in the previous section)
- A number of ephemeral, external-facing resources that automate their own DNS record creation and modification via Cloudflare's API. For the most part, these workloads are exposed via cloudflared, and are not accessible on any internal IP. (That could be changed, but I don't have a use case for it.)
The problem?
Inside the network, hosts have to use internal DNS to work with the AD domain on [zone].com. My external-only resources have DNS records like [resource].[zone].com. When a client attempts to resolve [resource].[zone].com from inside the local network, Windows DNS will return a resolution failure: the internal DNS server is 'authoritative' for [zone].com, and therefore answers that the name does not exist rather than forwarding the unknown query to an upstream server.
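The behavior is easy to confirm from an internal client by resolving the same name against two different servers. Below is a minimal sketch using the built-in Resolve-DnsName cmdlet. The helper name, the resource name, and the internal server address in the usage comment are placeholders for illustration; 1.1.1.1 is Cloudflare's public resolver.

```powershell
# Sketch: resolve one name against several DNS servers and report each outcome.
# Compare-DnsResolution is a hypothetical helper name, not part of any module.
function Compare-DnsResolution {
    Param(
        [Parameter(Mandatory)][string]$Name,
        [Parameter(Mandatory)][string[]]$Server
    )
    foreach ($s in $Server) {
        try {
            # -ErrorAction Stop turns a failed lookup into a catchable error
            $answers = Resolve-DnsName -Name $Name -Server $s -DnsOnly -ErrorAction Stop
            [pscustomobject]@{ Server = $s; Resolved = $true; Detail = ($answers.Type -join ",") }
        } catch {
            [pscustomobject]@{ Server = $s; Resolved = $false; Detail = $_.Exception.Message }
        }
    }
}

# Example (placeholder values): internal DNS server first, public resolver second
# Compare-DnsResolution -Name "resource.zone.com" -Server "10.0.0.10", "1.1.1.1"
```

In the situation described above, the row for the internal server should show Resolved as False, while the row for the public resolver should show True.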
The easiest solution is to manually add a record to the internal server for each resource. However, I find this distasteful, particularly when the external record is created automatically, and it certainly wouldn't scale under a 'real' workload.
So, the question becomes: How can I - without repeated manual interventions - resolve these resources from within the network?
The Solution
Restricting ourselves to automated solutions, there are two promising options:
1. Automate Windows DNS record creation on the same trigger as Cloudflare record creation, or
2. 'Sync' Cloudflare records to Windows DNS
#1 is much harder than it may seem. There are many dynamic DNS (DDNS) clients around, but the vast majority (including the one I use for my Cloudflare records) function by calling the APIs of a public DNS provider. These will not work with Windows Server, which does not expose an HTTPS API for DNS. Windows Server is compliant with RFC 2136, though, and DDNS clients based on this protocol do exist. Unfortunately, Windows Server (to my knowledge) only provides one way to make authenticated updates via the 2136 method - Kerberos - and this makes the automation complex, depending on the host and tool you are trying to authenticate from. (And I would not recommend solving this conundrum by allowing unauthenticated record updates.) I do not know of any DDNS clients that support authenticated Windows Server updates out of the box. (That said, please let me know if you see one!) Ultimately, I've decided this method is more trouble than it's worth, given that route #2 is available.
#2 is much more user-friendly, as Cloudflare provides an excellent and easy-to-use API. It also creates convenient CNAMEs that can point our Windows DNS entries to Cloudflare records, rather than re-creating A records that may become out-of-sync. For example, if we create a Cloudflare A record, 'a.b.com', Cloudflare will automatically create a CNAME, 'a.b.com.cdn.cloudflare.net'. This allows us to create a CNAME in our internal DNS records, pointing to the Cloudflare CNAME, that will always return the most up-to-date DNS information to our clients.
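To make the naming pattern concrete, here is a small sketch that builds the Cloudflare-side alias for a record. It assumes only the '[record].[zone].cdn.cloudflare.net' pattern described above. Get-CloudflareAlias is a hypothetical helper name, and it accepts either a short name or a fully qualified one.

```powershell
# Sketch: build the Cloudflare-side alias for a record in a zone.
# The "<fqdn>.cdn.cloudflare.net" pattern is taken from the description above;
# Get-CloudflareAlias is a hypothetical helper name.
function Get-CloudflareAlias {
    Param(
        [Parameter(Mandatory)][string]$RecordName,  # short name ("a") or FQDN ("a.b.com")
        [Parameter(Mandatory)][string]$ZoneName     # e.g. "b.com"
    )
    $zone = $ZoneName.TrimEnd(".")
    $name = $RecordName.TrimEnd(".")
    # Append the zone only if the caller passed a short name
    if ($name -ne $zone -and -not $name.EndsWith(".$zone", [System.StringComparison]::OrdinalIgnoreCase)) {
        $name = "$name.$zone"
    }
    return "$name.cdn.cloudflare.net"
}

# Both calls produce the same alias
Get-CloudflareAlias -RecordName "a" -ZoneName "b.com"
Get-CloudflareAlias -RecordName "a.b.com" -ZoneName "b.com"
```

The returned string is what the internal CNAME would point to.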
Below is a script I've created to run on my Windows DNS Server via Scheduled Tasks. It creates local CNAMEs for all A records in a given Cloudflare zone.
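For the Scheduled Tasks part, registration could look roughly like the sketch below. It uses the built-in ScheduledTasks cmdlets. The helper name, task name, script path, 15-minute interval, and the choice to run as SYSTEM are all assumptions for illustration rather than a description of my actual setup. Handling of repetition settings has varied between Windows versions, so test it on your target server.

```powershell
# Sketch: register a task that runs the sync script on a fixed interval as SYSTEM.
# Register-DnsSyncTask, the default task name and the interval are placeholders.
function Register-DnsSyncTask {
    Param(
        [Parameter(Mandatory)][string]$ScriptPath,
        [string]$TaskName = "Sync Cloudflare DNS",
        [int]$IntervalMinutes = 15
    )
    # Launch Windows PowerShell with the script as its -File argument
    $action = New-ScheduledTaskAction -Execute "powershell.exe" `
        -Argument "-NoProfile -File `"$ScriptPath`""
    # Start now and repeat at the given interval
    $trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
        -RepetitionInterval (New-TimeSpan -Minutes $IntervalMinutes)
    $principal = New-ScheduledTaskPrincipal -UserId "SYSTEM" `
        -LogonType ServiceAccount -RunLevel Highest
    Register-ScheduledTask -TaskName $TaskName -Action $action `
        -Trigger $trigger -Principal $principal
}

# Example (placeholder path), run from an elevated session on the DNS server:
# Register-DnsSyncTask -ScriptPath "C:\Scripts\Sync-CloudflareDns.ps1"
```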
An Example Script
# I recommend retrieving these values from an environment file rather than hard-coding them
$api_token = ""
$zone_id = ""
$zone_name = ""
$logfile = ""
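# Retrieve the zone's DNS records from the Cloudflare API.
# Note: this endpoint is paginated, so a large zone may need additional requests.
# -UseBasicParsing avoids the Internet Explorer dependency in Windows PowerShell 5.1.
$response = Invoke-WebRequest -UseBasicParsing `
  -Uri "https://api.cloudflare.com/client/v4/zones/$zone_id/dns_records" `
  -Headers @{"Content-Type"="application/json"; "Authorization"="Bearer $api_token"}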
# Write a timestamped entry to the console and append it to the log file
function Write-Log {
    Param(
        [string]$Content,
        [string]$Path
    )
    $time = Get-Date -UFormat "%m/%d/%Y %H:%M:%S"
    $entry = "${time}: $Content"
    Write-Output $entry
    Write-Output $entry >> $Path
}
if ($response.StatusCode -ne 200) {
    Write-Log -Content "status code other than 200, exiting" -Path $logfile
    exit
} else {
    Write-Log -Content "received response with status code 200, processing contents" -Path $logfile
}
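# Keep only the A records, then strip the trailing zone suffix so that
# remote names can be compared with local HostName values
$remote_records = ($response.Content | ConvertFrom-Json).result
$remote_arecords = $remote_records | Where-Object type -eq "A"
foreach ($record in $remote_arecords) {
    $record.name = $record.name -replace "\.$([regex]::Escape($zone_name))$", ""
}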
# Index existing local records by host name, skipping service records
# (names containing "_") and the zone apex ("@")
$local_records = @{}
foreach ($record in (Get-DnsServerResourceRecord -ZoneName $zone_name)) {
    if ($record.HostName -notmatch "_" -and $record.HostName -notmatch "@") {
        if (-not $local_records.ContainsKey($record.HostName)) {
            $local_records.Add($record.HostName, $record)
        }
    }
}
# Collect remote A records that have no local counterpart (ignoring the zone apex)
$recordstoadd = @()
foreach ($remote_arecord in $remote_arecords) {
    $name = $remote_arecord.name
    if ($name -ne $zone_name -and -not $local_records.ContainsKey($name)) {
        $recordstoadd += $remote_arecord
    }
}
if ($recordstoadd.Length -eq 0) {
    Write-Log -Content "no new records detected, exiting" -Path $logfile
} else {
    foreach ($remote_arecord in $recordstoadd) {
        # Point the local name at the Cloudflare-side alias, e.g. a.b.com.cdn.cloudflare.net
        $cname = [string]::Join(".", @($remote_arecord.name, $zone_name, "cdn.cloudflare.net"))
        Add-DnsServerResourceRecordCName -Name $remote_arecord.name -HostNameAlias $cname `
            -ZoneName $zone_name
        Write-Log -Content "Added CNAME $($remote_arecord.name) -> $cname to local dns" -Path $logfile
    }
}
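
The comment at the top of the script recommends reading the settings from an environment file rather than hard-coding them. One way to do that is sketched below, assuming a plain text file of KEY=VALUE lines. Import-EnvFile, the file path, and the key names are all hypothetical, and the file should be readable only by the account that runs the task.

```powershell
# Sketch: read KEY=VALUE pairs from a text file into a hashtable.
# Import-EnvFile and the key names used in the example are hypothetical.
function Import-EnvFile {
    Param([Parameter(Mandatory)][string]$Path)
    $values = @{}
    foreach ($line in (Get-Content -Path $Path)) {
        $line = $line.Trim()
        # Skip blank lines and comments
        if ($line -eq "" -or $line.StartsWith("#")) { continue }
        # Split on the first "=" only, so values may themselves contain "="
        $parts = $line -split "=", 2
        if ($parts.Count -eq 2) {
            $values[$parts[0].Trim()] = $parts[1].Trim().Trim('"')
        }
    }
    return $values
}

# Example (placeholder path and key names):
# $config    = Import-EnvFile -Path "C:\Scripts\cloudflare.env"
# $api_token = $config["CF_API_TOKEN"]
# $zone_id   = $config["CF_ZONE_ID"]
```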
