---
title: "Using TLS ECH from Python"
date: 2025-01-10T13:00:00-00:00
tags:
  - DEfO
  - ECH
  - OpenSSL
  - Python
  - TLS
params:
  author: 'Iain Learmonth'
---

At first, the idea of encrypting more of the metadata found inside the initial packet (the "ClientHello") of a TLS
connection may seem simple and obvious, but there are of course reasons that this wasn't done right from the start.
In this post I will describe the flow of a connection using Encrypted Client Hello (ECH) to protect the metadata fields,
and present a working code example using a fork of CPython built with the DEfO project's OpenSSL fork to connect to
ECH-enabled HTTPS servers.

To understand why this is an issue, let's take a step back and look at how websites are hosted.
Many websites are hosted on shared servers, which means that a single server machine is responsible for serving
multiple websites, possibly hundreds or thousands.
This is known as the shared hosting model.
In this setup, when a user types in a URL or clicks on a link to visit a website and the browser connects to the server,
the server needs to know which website the user is requesting.
This is where the Server Name Indication (SNI) comes in - it's a field in the initial packet of a TLS connection that
tells the server which website the user is trying to access.
The server can then send the correct certificate so that the browser can authenticate the connection, and then send the
requested website content.

Because this field is sent unencrypted, anyone who can see the traffic between the user's browser and
the server can intercept the SNI and know which website the user is trying to visit.
This can be a privacy concern, as it allows ISPs, network administrators, or other unwanted observers to build a profile
of the user's browsing history.
It's not just about the websites they visit, but also about the potential for censorship or targeted attacks.
With the SNI being unencrypted, it's like sending a postcard with the address visible to anyone who handles it - it may
not be the end of the world for most browsing activity, but it's certainly not private.
Encrypted Client Hello aims to change this by encrypting the SNI and other metadata, making it much harder for third
parties to intercept and exploit this information.
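
To make the exposure concrete, here is a minimal sketch that uses only the standard `ssl` module from a stock Python
build (no ECH support needed) to generate a ClientHello in memory, without opening any network connection.
The helper name `build_client_hello` and the hostname `example.com` are placeholders chosen for illustration.

```python
import ssl


def build_client_hello(hostname: str) -> bytes:
    """Return the raw bytes of the first TLS flight (the ClientHello) for hostname."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Memory BIOs let us capture what would be written to the network.
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = context.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and OpenSSL is now
        # waiting for a server reply that will never arrive.
        pass
    return outgoing.read()


client_hello = build_client_hello("example.com")
# Without ECH, the server name travels in the clear inside the SNI extension.
print(b"example.com" in client_hello)
```

Any observer on the path sees the same bytes that this script captures, which is why the hostname is so easy to log
or filter on.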

So, why wasn't it easy to protect the SNI and other metadata from the start?
The main challenge was that, in order to encrypt the SNI, the client (i.e., the user's browser) needs to know the
public key that the server wants the ClientHello to be encrypted with in advance.
However, the server's ECH public key is tied to the specific website being requested, and there wasn't a straightforward
way to discover a public key that could be used to talk to the server without revealing the SNI.
This created a chicken-and-egg problem, where the client couldn't encrypt the SNI without knowing the server's public
key, but it couldn't know the server's public key without sending the SNI in plaintext.

This problem is solved with ECH by introducing a new type of DNS record, called an
[HTTPS record](https://datatracker.ietf.org/doc/html/rfc9460).
An HTTPS record is a special type of DNS record that contains the ECH public key of the server, along with other metadata,
in a way that can be retrieved by the client without revealing the SNI (the website name is still leaked via the DNS
request, but it is possible to protect your requests using DNS-over-TLS or DNS-over-HTTPS).
The HTTPS record is typically retrieved by the client during the DNS lookup process, before the TLS connection is
established.

The HTTPS record contains an ECH configuration, which is used to encrypt the SNI and other metadata.
This is generated by the server and is tied to the specific configuration of the server, rather than to a specific
website.
By using HTTPS records to retrieve the server's ECH public key, we are able to break the chicken-and-egg problem and
provide a way to encrypt the SNI and other metadata.
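
To give a feel for what such a configuration carries, the sketch below decodes the binary `ECHConfigList` structure
using only the standard library.
It assumes the layout from the ECH specification for version `0xfe0d` (config ID, KEM ID, public key, cipher suites,
maximum name length, public name, extensions); the names `EchConfigSummary` and `parse_ech_config_list` and the sample
bytes are made up for this illustration, and extensions are ignored.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

ECH_VERSION = 0xFE0D  # the only ECHConfig version this sketch understands


@dataclass
class EchConfigSummary:
    config_id: int
    kem_id: int
    public_key: bytes
    cipher_suites: List[Tuple[int, int]]  # (KDF ID, AEAD ID) pairs
    public_name: str


def parse_ech_config_list(data: bytes) -> List[EchConfigSummary]:
    """Decode an ECHConfigList, skipping configs with an unknown version."""
    (total,) = struct.unpack_from("!H", data, 0)
    if total != len(data) - 2:
        raise ValueError("ECHConfigList length mismatch")
    pos, summaries = 2, []
    while pos < len(data):
        version, length = struct.unpack_from("!HH", data, pos)
        body = data[pos + 4 : pos + 4 + length]
        pos += 4 + length
        if version != ECH_VERSION:
            continue
        config_id, kem_id, key_len = struct.unpack_from("!BHH", body, 0)
        offset = 5
        public_key = body[offset : offset + key_len]
        offset += key_len
        (suites_len,) = struct.unpack_from("!H", body, offset)
        offset += 2
        suites = [struct.unpack_from("!HH", body, offset + i) for i in range(0, suites_len, 4)]
        offset += suites_len
        _max_name_len, name_len = struct.unpack_from("!BB", body, offset)
        offset += 2
        public_name = body[offset : offset + name_len].decode("ascii")
        summaries.append(EchConfigSummary(config_id, kem_id, public_key, suites, public_name))
    return summaries


# Synthetic sample: config ID 7, KEM 0x0020 (X25519), an all-zero dummy key,
# one cipher suite (KDF 0x0001, AEAD 0x0001) and the public name "cover.example".
name = b"cover.example"
contents = (
    struct.pack("!BHH", 7, 0x0020, 32) + bytes(32)
    + struct.pack("!HHH", 4, 0x0001, 0x0001)
    + struct.pack("!BB", 0, len(name)) + name
    + struct.pack("!H", 0)  # empty extensions list
)
config = struct.pack("!HH", ECH_VERSION, len(contents)) + contents
sample = struct.pack("!H", len(config)) + config

print(parse_ech_config_list(sample)[0].public_name)
```

The public name is worth noticing: it is the name a client places in the unencrypted part of the handshake, so it
identifies the server's ECH deployment rather than any individual website.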

Before we can look up the HTTPS record, it's first necessary to work out where that record would live.
These records have been designed to be quite flexible, so they can accommodate services running on non-default port
numbers.
If the default port number is in use then the HTTPS record will be on the same domain name as the website, but for
non-default port numbers, there will be a prefix to the domain name:

```python
import urllib.parse
from typing import Optional


def svcbname(url: str) -> Optional[str]:
    """Derive DNS name of SVCB/HTTPS record corresponding to target URL."""
    parsed = urllib.parse.urlparse(url)
    if parsed.scheme == "https":
        # Default HTTPS port: the record lives on the hostname itself
        if (parsed.port or 443) == 443:
            return parsed.hostname
        else:
            # Non-default port: prefix the hostname with _<port>._https
            return f"_{parsed.port}._https.{parsed.hostname}"
    elif parsed.scheme == "http":
        # Plain HTTP on a default port maps to the same record as HTTPS
        if (parsed.port or 80) in (443, 80):
            return parsed.hostname
        else:
            return f"_{parsed.port}._https.{parsed.hostname}"
    else:
        # For now, no other scheme is supported
        return None
```

To keep it simple, the examples in this post will use plain DNS but the technique is equally applicable to DNS-over-TLS
and DNS-over-HTTPS. Now that we have the domain name to query, we can fetch the ECH configuration from the DNS using
the [dnspython](https://www.dnspython.org/) library:

```python
import logging
import sys
from typing import List

import dns.resolver


def get_ech_configs(domain) -> List[bytes]:
    """Return the ECH configurations published in the HTTPS record for domain."""
    try:
        answers = dns.resolver.resolve(domain, "HTTPS")
    except dns.resolver.NoAnswer:
        logging.warning(f"No HTTPS record found for {domain}")
        return []
    except Exception as e:
        logging.critical(f"DNS query failed: {e}")
        sys.exit(1)
    configs: List[bytes] = []
    for rdata in answers:
        if hasattr(rdata, "params"):
            params = rdata.params
            # SvcParamKey 5 is the "ech" parameter
            echconfig = params.get(5)
            if echconfig:
                configs.append(echconfig.ech)
    if len(configs) == 0:
        logging.warning(f"No echconfig found in HTTPS record for {domain}")
    return configs
```

Once the ECH configurations are known, these can be used to establish the connection and fetch the website:

```python
import logging
import socket
import ssl
import urllib.parse

import certifi


def get_http(url, ech_configs) -> bytes:
    parsed = urllib.parse.urlparse(url)
    # Fall back to the default HTTPS port and root path when the URL omits them
    hostname, port, path = parsed.hostname, parsed.port or 443, parsed.path or "/"
    logging.debug(f"Performing GET request for https://{hostname}:{port}{path}")
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(certifi.where())
    for config in ech_configs:
        try:
            # set_ech_config is only available in the ECH-enabled CPython fork
            context.set_ech_config(config)
        except ssl.SSLError as e:
            logging.error(f"SSL error: {e}")
    with socket.create_connection((hostname, port)) as sock:
        with context.wrap_socket(sock, server_hostname=hostname, do_handshake_on_connect=False) as ssock:
            try:
                ssock.do_handshake()
                logging.debug("Handshake completed with ECH status: %s", ssock.get_ech_status().name)
                logging.debug("Inner SNI: %s, Outer SNI: %s", ssock.server_hostname, ssock.outer_server_hostname)
                request = f'GET {path} HTTP/1.1\r\nHost: {hostname}\r\nConnection: close\r\n\r\n'
                ssock.sendall(request.encode('utf-8'))
                # Read until the server closes the connection
                response = b''
                while True:
                    data = ssock.recv(4096)
                    if not data:
                        break
                    response += data
                return response
            except ssl.SSLError as e:
                logging.error(f"SSL error: {e}")
                raise
```

The important step here is the new
[`set_ech_config`](https://irl.github.io/cpython/library/ssl.html#ssl.SSLContext.set_ech_config) method on the
`SSLContext` that allows you to add the ECH configuration containing the public key.
If there are multiple records, the underlying OpenSSL will determine which of the keys to use.
There are also a few new methods that allow you to get the status information relating to ECH from the `SSLSocket`
after the completion of the handshake.

In the simple case, that's all there is to it.
If you were to watch the connection with Wireshark, you would not be able to see the true SNI being sent to the server
and would only see the decoy SNI present in the unencrypted "ClientHelloOuter".
This decoy SNI is added to appease [middleboxes](https://en.wikipedia.org/wiki/Middlebox) that may block traffic,
accidentally or deliberately, if that field is missing entirely.
There are also further protections against such middleboxes from the application of GREASE:

> If the client attempts to connect to a server and does not have an ECHConfig structure available for the server, it
> SHOULD send a GREASE "encrypted_client_hello" extension in the first ClientHello [...]

This means that if your client supports ECH but does not have the configuration available to use it, the client should
still send an ECH extension filled with nonsense anyway.
This will help to detect deployment issues early as errors will be immediately obvious to users and won't rely on
servers having deployed ECH before the errors are triggered.

Finally, if the server sees this GREASE ECH extension then it can use this to know that you support ECH but didn't
have a configuration available.
In its reply, it can send a "retry config" and then terminate the connection.
You then have the configuration available to start the connection again with a real ECH extension this time, and can
cache that for future requests too.
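
The retry-and-cache flow can be sketched independently of any particular TLS API.
Everything named here is hypothetical and exists only for illustration: the `EchRetryRequired` exception, the
`ech_cache` dictionary, the `fetch_with_retry` helper and the `fake_connect` stand-in are not part of the CPython fork,
and a real client would surface retry configs through whatever interface its TLS library provides.

```python
from typing import Callable, Dict, List


class EchRetryRequired(Exception):
    """Hypothetical error: the server did not accept ECH but supplied retry configs."""

    def __init__(self, retry_configs: List[bytes]):
        super().__init__("ECH not accepted; server supplied retry configs")
        self.retry_configs = retry_configs


# Hypothetical in-memory cache: hostname -> ECH configs that last worked.
ech_cache: Dict[str, List[bytes]] = {}


def fetch_with_retry(
    hostname: str,
    connect: Callable[[str, List[bytes]], bytes],
    max_retries: int = 1,
) -> bytes:
    """Call connect(hostname, configs), retrying with server-supplied configs."""
    configs = ech_cache.get(hostname, [])
    for attempt in range(max_retries + 1):
        try:
            response = connect(hostname, configs)
        except EchRetryRequired as e:
            # Give up if we are out of attempts or the server offered nothing new.
            if attempt == max_retries or not e.retry_configs:
                raise
            configs = e.retry_configs
            continue
        if configs:
            # Remember the configs that worked for future requests.
            ech_cache[hostname] = configs
        return response
    raise ValueError("max_retries must be >= 0")


calls: List[List[bytes]] = []


def fake_connect(hostname: str, configs: List[bytes]) -> bytes:
    """Stand-in for a real TLS connection: succeeds only once configs are known."""
    calls.append(list(configs))
    if not configs:
        raise EchRetryRequired([b"retry-config"])
    return b"ok"


result = fetch_with_retry("example.com", fake_connect)
print(result, len(calls))
```

Limiting the number of retries matters: a misconfigured server that keeps handing out configs it then rejects should
produce an error, not an endless reconnection loop.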

For a full client example including the use of retry configs, you can see our
[example Python client](https://github.com/defo-project/docker-defo-client/blob/main/pyclient.py) on GitHub.
You'll need to use this with our [CPython fork](https://github.com/irl/cpython) and
[OpenSSL fork](https://github.com/defo-project/openssl).