
How do USB keyboards work?

This section describes at a very high level how most USB 2.0 keyboards interact with their connected computer (the host) via USB, from connection to device setup to keypress handling. It is not intended to be a definitive explanation, just an overview. For more detailed explanations, I recommend the USB in a NutShell series of articles, parts of which have been included in this section. The full 650-page USB 2.0 specification can also be useful as a reference.

Overview

USB stands for Universal Serial Bus, a name which identifies the three main properties of USB:

USB 2.0 is a single-host, multiple-device protocol. There must be exactly one host coordinating all communication on the wire. In most cases, the host is a PC or laptop, and the devices are any connected peripherals like keyboards, mice, webcams, etc. Each device is assigned a unique 7-bit address when it first connects to the host (a process known as device enumeration). Address 0 is reserved for devices that have not yet been assigned an address, so up to 127 devices can be connected to a single USB bus at a time. Most computers will have multiple USB buses, e.g. one per external USB port.
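The 127-device limit follows directly from the width of the address field. A quick check of the arithmetic, with variable names chosen purely for illustration:

```python
ADDRESS_BITS = 7

# A 7-bit field can hold 2**7 = 128 distinct values (0 to 127).
total_addresses = 2 ** ADDRESS_BITS

# Address 0 is the default address a device answers to before the host
# assigns it a unique one, so no enumerated device can keep it.
RESERVED_ADDRESSES = 1

max_devices = total_addresses - RESERVED_ADDRESSES
```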

The USB protocol defines four data transfer modes: control, interrupt, bulk and isochronous. Which mode a device uses depends on the intended use of the device.

USB keyboards spend most of their time transferring data in interrupt mode because the host needs to react quickly to keypress events to avoid perceivable lag.

The USB protocol identifies the functionality of a device using device classes, defined by a class code sent to the host. This allows the host to, for example, load the right drivers for the device and to support new devices from different manufacturers, making USB universal. The USB standard specifies some standard device classes, but manufacturers can implement their own using the wildcard “vendor-specific” class type. One standard-defined class is the Human Interface Device (HID) class; this class includes devices intended to be used by humans to interact with a computer, such as keyboards and mice. The threeboard implements this USB HID class to identify itself as a keyboard.
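As a rough illustration, a host could turn a class code into a driver family with a simple lookup. The codes below are a small subset of the USB-IF's published class codes; the `describe_class` helper is hypothetical and not part of any real driver stack.

```python
# A few class codes defined by the USB-IF (not an exhaustive list).
USB_CLASS_CODES = {
    0x01: "Audio",
    0x03: "Human Interface Device (HID)",
    0x08: "Mass Storage",
    0x09: "Hub",
    0xFF: "Vendor-specific",
}


def describe_class(class_code: int) -> str:
    """Return a readable name for a class code, or flag it as unknown."""
    return USB_CLASS_CODES.get(class_code, f"Unknown (0x{class_code:02X})")
```

A keyboard reporting class code `0x03` would be matched to the host's generic HID driver, which is why most keyboards work without installing anything.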

Connector wiring

The wiring of a USB 2.0-only connector (i.e. a connector capable of up to a 480 Mbit/s “high speed” data rate) is quite simple, with only 4 pins needed: VBUS (5 V power), D- and D+ (the differential data pair) and GND (ground).

USB 3.0 requires an additional 5 pins for a total of 9, but keeps the same connector types. These connectors sometimes have blue inserts to indicate USB 3.0 support, and they remain backwards compatible with USB 2.0. Type A connectors are only used on hosts (e.g. computers), and Type B connectors are only used on devices (e.g. phones, keyboards), but they have the same wiring and pin configuration. Newer Type C connectors (often called “USB-C”) are reversible, can be used on both hosts and devices, and are the only connector type that supports USB4. These connectors are radically different, with 24 pins, but still maintain backwards compatibility with all previous USB standards. This is possible because USB-C contains the 4-pin USB 2.0 configuration as a subset of its 24 pins.

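The differential pair data line is an implementation of differential signalling: each bit is represented by the voltage difference between the two data wires rather than by the level of either wire alone, which makes the signal resistant to noise that affects both wires equally. Two wires are used, but the data is still sent serially, one bit at a time.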
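The noise resistance of differential signalling can be shown with a toy model. This is only a sketch: the voltage levels and the mapping of bits to wire polarity are assumptions made for illustration and do not reproduce USB's actual line encoding.

```python
def drive_pair(bit: int, high: float = 3.3, low: float = 0.0) -> tuple:
    """Drive the two wires to opposite levels depending on the bit."""
    return (high, low) if bit else (low, high)


def read_pair(d_plus: float, d_minus: float) -> int:
    """Recover the bit from the sign of the difference between the wires."""
    return 1 if d_plus > d_minus else 0


def send_with_noise(bits, noise: float):
    """Send bits one at a time, adding the same noise voltage to both wires."""
    received = []
    for bit in bits:
        d_plus, d_minus = drive_pair(bit)
        received.append(read_pair(d_plus + noise, d_minus + noise))
    return received
```

Because the receiver only looks at the difference between the wires, `send_with_noise([1, 0, 1], noise=1.5)` returns the original bits unchanged: the common offset cancels out.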

The remainder of this document assumes the USB 2.0 standard, although, as discussed, keyboards implementing USB 2.0 can still have USB-C connectors by using the connector's USB 2.0 subset.

Hardware and firmware

It’s important to think of a USB keyboard as a computer in its own right. To be able to fully implement the USB protocol, keyboards generally contain a 5 V microcontroller with clock speeds up to 80 MHz, as much as 100 KB of RAM, and hardware support for the USB protocol. These microcontrollers use RISC instruction sets, such as ARM or AVR. Firmware runs on the keyboard’s microcontroller, handling everything from checking for keypresses and lighting status LEDs to implementing the USB protocol and communicating with the host. QMK is a popular ARM and AVR-compatible open-source keyboard firmware. The threeboard uses its own custom-built firmware, available in this repository.

Device enumeration

USB device enumeration is the process by which a host detects, identifies and in some cases loads drivers for a USB device. This section won’t go into the electrical details of how hosts detect that a device has been plugged in; instead it focuses on how devices use the USB protocol to identify themselves to the host. One important electrical feature of this process, however, is how the host determines the speed of the device: a pull-up resistor connecting the D+ line to a 3.3 V supply (derived from VBUS) indicates the full-speed 12 Mbit/s mode, while a pull-up on the D- line instead indicates the low-speed 1.5 Mbit/s mode.
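The speed detection rule can be summarised in a few lines. This is a minimal sketch of the decision only: `detect_speed` is a hypothetical helper, and a real host controller makes this decision in hardware by sensing line voltages.

```python
from typing import Optional


def detect_speed(d_plus_pulled_up: bool, d_minus_pulled_up: bool) -> Optional[str]:
    """Infer the device speed from which data line carries the pull-up."""
    if d_plus_pulled_up and d_minus_pulled_up:
        raise ValueError("only one data line may be pulled up")
    if d_plus_pulled_up:
        return "full-speed"  # 12 Mbit/s
    if d_minus_pulled_up:
        return "low-speed"  # 1.5 Mbit/s
    return None  # no pull-up seen, so nothing is attached
```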

When a USB device is being enumerated, it cycles through several configuration states: default, addressed and configured. Once a device has been plugged in and is receiving power from the host, it enters the default state. In this state, it has not even been assigned an address yet. The process from default to configured (i.e. usable) state is visualised below:
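The state progression can be modelled as a small state machine. This is a simplified sketch rather than real firmware: the `UsbDevice` class is hypothetical, its two methods stand in for the standard SET_ADDRESS and SET_CONFIGURATION requests, and the descriptor reads a host performs between those steps are left out.

```python
from enum import Enum


class DeviceState(Enum):
    DEFAULT = "default"
    ADDRESSED = "addressed"
    CONFIGURED = "configured"


class UsbDevice:
    """Toy model of a device's enumeration state."""

    def __init__(self) -> None:
        self.state = DeviceState.DEFAULT
        self.address = 0  # devices answer on address 0 until assigned one

    def set_address(self, address: int) -> None:
        # Stands in for the host's SET_ADDRESS request.
        if self.state is not DeviceState.DEFAULT:
            raise RuntimeError("address already assigned")
        if not 1 <= address <= 127:
            raise ValueError("address must be non-zero and fit in 7 bits")
        self.address = address
        self.state = DeviceState.ADDRESSED

    def set_configuration(self, value: int) -> None:
        # Stands in for the host's SET_CONFIGURATION request.
        if self.state is not DeviceState.ADDRESSED:
            raise RuntimeError("device must be addressed and not yet configured")
        if value < 1:
            raise ValueError("configuration value must be at least 1")
        self.state = DeviceState.CONFIGURED


keyboard = UsbDevice()
keyboard.set_address(5)        # default -> addressed
keyboard.set_configuration(1)  # addressed -> configured
```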

The threeboard’s enumeration values (such as its device descriptor and additional HID descriptors) are all specified in usb/internal/descriptors.h.

Sending keypress events

During enumeration, a USB keyboard will have provided its endpoint descriptors to the host. These describe the USB endpoints the device supports. Endpoints are essentially isolated data buffers on the USB device, with important properties including a transfer direction (IN, meaning device to host, or OUT, meaning host to device) and a transfer type (control, interrupt, bulk or isochronous). All USB devices must provide at least one control endpoint (endpoint 0), which is the endpoint used during enumeration. Devices may provide multiple additional endpoints, e.g. for sending keypress data to the host.
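To make these properties concrete, the sketch below packs a standard 7-byte endpoint descriptor using the field layout from the USB 2.0 specification. The `endpoint_descriptor` helper is hypothetical, and the packet size and interval in the example are illustrative values, not necessarily those used by the threeboard.

```python
import struct

ENDPOINT_DESCRIPTOR_TYPE = 0x05
DIRECTION_IN = 0x80  # bit 7 of the endpoint address: set for IN, clear for OUT
TRANSFER_TYPES = {"control": 0, "isochronous": 1, "bulk": 2, "interrupt": 3}


def endpoint_descriptor(number: int, direction_in: bool, transfer_type: str,
                        max_packet_size: int, interval: int) -> bytes:
    """Pack an endpoint descriptor as little-endian bytes."""
    address = (DIRECTION_IN if direction_in else 0) | (number & 0x0F)
    return struct.pack(
        "<BBBBHB",
        7,                              # bLength: descriptor size in bytes
        ENDPOINT_DESCRIPTOR_TYPE,       # bDescriptorType
        address,                        # bEndpointAddress
        TRANSFER_TYPES[transfer_type],  # bmAttributes
        max_packet_size,                # wMaxPacketSize
        interval,                       # bInterval: polling interval
    )


# An interrupt IN endpoint 1 with 8-byte packets and an interval of 1.
descriptor = endpoint_descriptor(1, True, "interrupt", 8, 1)
```

Here `descriptor.hex()` is `07058103080001`: the `81` byte encodes endpoint 1 with the IN direction bit set, and `03` marks it as an interrupt endpoint.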

USB keyboards may be configured differently depending on their functionality, but in the most basic case (and in the case of the threeboard), a USB keyboard provides only one additional endpoint, an IN endpoint 1, with an interrupt transfer type. The host polls the device as frequently as every 1 ms (this polling period is configurable in the endpoint descriptor) to check whether the device has data in its endpoint buffer waiting to be sent to the host. A USB HID-compliant keyboard device will put a keyboard input report message in its IN buffer. The keyboard input report is 8 bytes long: one modifier byte identifying all of the modifier keys being pressed, one reserved byte, and 6 keycode bytes identifying up to 6 other keys being pressed. A table mapping keys to these key codes is defined in section 10 of the HID usage tables document.
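The report layout can be sketched as follows. The modifier bit positions and the keycode for the A key (0x04) come from the HID specification and usage tables; the `build_input_report` helper and the modifier names are hypothetical.

```python
# Bit assigned to each modifier key within the first report byte.
MODIFIER_BITS = {
    "left_ctrl": 0x01, "left_shift": 0x02, "left_alt": 0x04, "left_gui": 0x08,
    "right_ctrl": 0x10, "right_shift": 0x20, "right_alt": 0x40, "right_gui": 0x80,
}

KEY_A = 0x04  # usage ID for the A key
MAX_KEYS = 6


def build_input_report(modifiers, keycodes) -> bytes:
    """Build the 8-byte report: modifier byte, reserved byte, 6 keycodes."""
    if len(keycodes) > MAX_KEYS:
        raise ValueError("a report can carry at most 6 keycodes")
    modifier_byte = 0
    for name in modifiers:
        modifier_byte |= MODIFIER_BITS[name]
    # Unused keycode slots are filled with 0, meaning "no key".
    padded = list(keycodes) + [0] * (MAX_KEYS - len(keycodes))
    return bytes([modifier_byte, 0x00] + padded)


# Left Shift held while pressing A, i.e. typing a capital "A".
report = build_input_report(["left_shift"], [KEY_A])
```

When all keys are released, the keyboard sends a report of eight zero bytes, which is what `build_input_report([], [])` produces.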

An example host polling sequence is visualised below:
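The same kind of sequence can also be approximated in code. This is a toy simulation under stated assumptions: `KeyboardEndpoint` and `run_polls` are hypothetical names, each loop iteration stands for one polling interval, and the string "NAK" stands for the device's negative acknowledgement when it has nothing to send.

```python
from collections import deque


class KeyboardEndpoint:
    """Toy model of the keyboard's interrupt IN endpoint buffer."""

    def __init__(self) -> None:
        self._pending = deque()

    def queue_report(self, report: bytes) -> None:
        self._pending.append(report)

    def poll(self):
        # Return the oldest queued report, or None if the buffer is empty.
        return self._pending.popleft() if self._pending else None


def run_polls(endpoint: KeyboardEndpoint, events: dict, num_polls: int):
    """Poll num_polls times; events maps a poll index to a report queued then."""
    log = []
    for tick in range(num_polls):
        if tick in events:
            endpoint.queue_report(events[tick])
        response = endpoint.poll()
        log.append("NAK" if response is None else response.hex())
    return log


# The A key (keycode 0x04) is pressed before poll 1 and released before poll 3.
press_a = bytes([0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00])
release_all = bytes(8)
log = run_polls(KeyboardEndpoint(), {1: press_a, 3: release_all}, 5)
```

Most polls find an empty buffer; the host only receives a report on the polls that follow a change in key state.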