Binary is the base-2 number system, using only the digits 0 and 1. Each digit in a binary number is called a bit, and its positional value doubles with each position from right to left: the rightmost bit has value 1, the next has value 2, then 4, 8, 16, 32, 64, and 128 for the leftmost bit of a standard 8-bit byte. A nibble is a group of 4 bits, representing values from 0 to 15. A byte is a group of 8 bits, representing values from 0 to 255. Understanding these positional values is the foundation for all number system conversions encountered in security work.
Hexadecimal is the base-16 number system, using digits 0-9 and letters A-F (where A=10, B=11, C=12, D=13, E=14, F=15). Each hexadecimal digit represents exactly one nibble (4 bits), and two hexadecimal digits represent one byte. This property makes hexadecimal the preferred representation across security tools, packet analysis software, debuggers, and cryptographic output — it is a compact, human-readable encoding of binary data. Hexadecimal values are conventionally prefixed with 0x to distinguish them from decimal, though in contexts where hex is unambiguous (such as color codes or memory addresses) the prefix may be omitted. Every security professional needs fluency in hex representation.
- Bit positional values (right to left): 1, 2, 4, 8, 16, 32, 64, 128
- Nibble: 4 bits representing 0-15 decimal (0-F hex)
- Byte: 8 bits representing 0-255 decimal (0x00-0xFF hex)
- Each hex digit = exactly one nibble; two hex digits = exactly one byte
- Hex prefix convention: 0x2A or 2Ah — both indicate hexadecimal notation
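A few lines of Python confirm these positional values and range limits (Python's 0b and 0x literals mirror the notation above):

```python
# Each bit position i (counting from the right, starting at 0) contributes 2**i.
bit_values = [2 ** i for i in range(8)]   # [1, 2, 4, 8, 16, 32, 64, 128]
print(bit_values[::-1])                   # left to right: 128 down to 1

print(0b1111)       # largest nibble value: 15
print(0xF)          # the same value written in hex: 15
print(0b11111111)   # largest byte value: 255
print(0xFF)         # the same value written in hex: 255
```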
Converting from decimal to binary: divide the decimal number by 2 repeatedly, recording each remainder. The binary representation is the remainders read from bottom to top. For decimal 42: 42/2=21 remainder 0, 21/2=10 remainder 1, 10/2=5 remainder 0, 5/2=2 remainder 1, 2/2=1 remainder 0, 1/2=0 remainder 1. Reading remainders bottom to top: 101010, padded to 8 bits: 00101010. Converting from binary back to decimal: sum the positional values of all bits set to 1. In 00101010, bits at positions 1 (value 2), 3 (value 8), and 5 (value 32) are set: 2 + 8 + 32 = 42.
Converting between binary and hexadecimal is particularly straightforward due to the 4-bit alignment. Split the binary number into groups of 4 bits (nibbles) from right to left, and convert each nibble to its hexadecimal digit. The binary value 00101010 splits as 0010 | 1010, which is 2 and A in hex, giving 0x2A. Decimal 42 = binary 00101010 = hex 0x2A. To convert hex to binary, expand each hex digit to its 4-bit binary equivalent: hex 0x2A = 0010 1010 in binary. This direct correspondence makes mental conversion between binary and hex fast once you memorize the 16 nibble values: 0000=0, 0001=1, 0010=2, 0011=3, 0100=4, 0101=5, 0110=6, 0111=7, 1000=8, 1001=9, 1010=A, 1011=B, 1100=C, 1101=D, 1110=E, 1111=F.
- Decimal to binary: repeated division by 2, collecting remainders from bottom to top
- Binary to decimal: sum the positional values (1, 2, 4, 8, 16, 32, 64, 128) of all 1-bits
- Decimal 42 = binary 00101010 = hex 0x2A
- Binary to hex: group into nibbles of 4 bits, convert each nibble to hex digit
- Hex to binary: expand each hex digit to its 4-bit binary equivalent
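The conversion procedures above can be sketched in Python; the function names here are illustrative, not from any particular library:

```python
def dec_to_bin(n: int, width: int = 8) -> str:
    """Repeated division by 2, collecting remainders bottom to top."""
    bits = []
    while n > 0:
        bits.append(str(n % 2))   # the remainder is the next bit
        n //= 2
    return "".join(reversed(bits)).zfill(width)

def bin_to_dec(bits: str) -> int:
    """Sum the positional values of all 1-bits."""
    return sum(2 ** i for i, b in enumerate(reversed(bits)) if b == "1")

def bin_to_hex(bits: str) -> str:
    """Group into nibbles from the right, map each nibble to a hex digit."""
    bits = bits.zfill((len(bits) + 3) // 4 * 4)   # pad to a nibble boundary
    nibbles = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "0x" + "".join(format(int(n, 2), "X") for n in nibbles)

print(dec_to_bin(42))           # 00101010
print(bin_to_dec("00101010"))   # 42
print(bin_to_hex("00101010"))   # 0x2A
```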
ASCII (American Standard Code for Information Interchange) is a 7-bit character encoding standard mapping 128 characters to decimal values 0-127. The printable characters start at decimal 32 (space) and include uppercase letters (A=65 through Z=90), lowercase letters (a=97 through z=122), digits (0=48 through 9=57), and punctuation. Control characters occupy the range 0-31: null is 0, carriage return (CR) is 13, line feed (LF) is 10, horizontal tab is 9, and escape is 27. In hex representation: uppercase A is 0x41, lowercase a is 0x61, the digit 0 is 0x30, and null is 0x00.
ASCII knowledge is directly applicable in security contexts. When examining packet captures in Wireshark or tcpdump, the hex dump pane shows raw bytes alongside their ASCII representation — recognizing HTTP headers, SMTP commands, and other text protocols in hex is an essential skill for protocol analysis. When reviewing malware disassembly or memory dumps, string constants appear as sequences of ASCII values; recognizing that the hex sequence 47 45 54 20 2F is "GET /" helps identify HTTP communication patterns. URL encoding replaces non-ASCII or reserved characters with percent-encoded hex values — %20 is space (decimal 32, hex 0x20), and recognizing these patterns is important for detecting injection attacks in web logs.
- Uppercase A = decimal 65 = hex 0x41 = binary 01000001
- Lowercase a = decimal 97 = hex 0x61 = binary 01100001
- Null byte = decimal 0 = hex 0x00 — used in buffer overflow payloads and string termination
- CR (carriage return) = decimal 13 = hex 0x0D; LF (line feed) = decimal 10 = hex 0x0A
- URL encoding: space = %20, slash = %2F, colon = %3A — percent sign followed by two hex digits
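These lookups are easy to verify in Python; `bytes.fromhex` and `urllib.parse.unquote` from the standard library do the decoding:

```python
from urllib.parse import unquote

# Decoding raw hex bytes to ASCII, as seen in a Wireshark hex dump pane.
raw = bytes.fromhex("47 45 54 20 2F")
print(raw.decode("ascii"))      # GET /

# The key reference points from the list above.
print(hex(ord("A")), hex(ord("a")), hex(ord("0")))   # 0x41 0x61 0x30

# URL percent-decoding: %XX is a percent sign plus two hex digits.
print(unquote("GET%20%2Fadmin"))   # GET /admin
```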
TCP flags are packed into a 9-bit field within the TCP header, where each bit position corresponds to a specific control flag. In hexadecimal representation, common flag combinations are: SYN (synchronize, initiating a connection) = 0x002, ACK (acknowledgment) = 0x010, SYN-ACK (the second step of the three-way handshake) = 0x012, FIN (finish, graceful close) = 0x001, RST (reset, abrupt close or connection refused) = 0x004, and PSH (push, deliver data to application immediately) = 0x008. In a Wireshark packet capture showing a SYN scan (like those performed by Nmap), packets with flags=0x002 indicate SYN packets sent to each target port to test for open status.
Beyond TCP, many protocol fields pack multiple pieces of information into single bytes using bitwise operations. The IP header's Type of Service byte contains a 6-bit DSCP (Differentiated Services Code Point) value and a 2-bit ECN (Explicit Congestion Notification) field. The DNS message header packs the query/response bit, opcode, and flags into a 16-bit flags field. DNS over UDP uses a 2-byte transaction ID for matching responses to queries — understanding that this is a 16-bit value (0x0000 to 0xFFFF) explains why the Kaminsky attack required guessing 65,536 possible values, and why the attack was successful in practice through flooding with multiple simultaneous guesses.
- TCP SYN = 0x002 — connection initiation; SYN scanning sends SYN to each target port
- TCP ACK = 0x010 — acknowledgment; SYN-ACK = 0x012 is the second step of the three-way handshake
- TCP RST = 0x004 — abrupt connection reset or refused connection indicator
- TCP FIN = 0x001 — graceful connection termination request
- Wireshark filter for SYN packets: tcp.flags == 0x002 isolates SYN packets in a capture
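A small Python sketch of flag testing with bitwise AND, using the mask values listed above (the `describe` helper is hypothetical, written for illustration):

```python
# TCP flag bit masks from the list above.
FIN, SYN, RST, PSH, ACK = 0x001, 0x002, 0x004, 0x008, 0x010

def describe(flags: int) -> list[str]:
    """Return the names of the flag bits set in a TCP flags value."""
    names = {FIN: "FIN", SYN: "SYN", RST: "RST", PSH: "PSH", ACK: "ACK"}
    return [name for mask, name in names.items() if flags & mask]

print(describe(0x002))   # ['SYN']         -> a SYN-scan probe
print(describe(0x012))   # ['SYN', 'ACK'] -> second step of the handshake
```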
Bitwise operations — AND, OR, XOR, and NOT — appear throughout security analysis, network programming, and low-level protocol implementation. The AND operation clears bits: applying a subnet mask to an IP address uses bitwise AND to extract the network address (clear all host bits to zero). For example, 192.168.1.75 AND 255.255.255.192 yields the /26 network address 192.168.1.64. The OR operation sets bits: combining the network address with an all-ones host portion using OR produces the broadcast address. NOT inverts all bits: the bitwise NOT of a subnet mask produces the host mask used in broadcast address calculation.
XOR (exclusive OR) is foundational in cryptography and malware obfuscation. XOR produces a 1 when the two input bits differ and a 0 when they are the same. In stream ciphers and one-time pads, plaintext is XOR'd with a keystream to produce ciphertext. XOR'ing the ciphertext with the same keystream recovers the plaintext. Malware frequently uses XOR encoding to obfuscate strings in binary executables, hiding URLs and command-and-control addresses from simple string analysis. During static malware analysis, identifying XOR loops in disassembly and extracting the key is a standard technique for recovering obfuscated strings and configuration data.
- Bitwise AND: used in subnet masking to extract the network address from an IP address
- Bitwise OR: sets specific bits — combines network address with host bits for broadcast calculation
- Bitwise XOR: used in stream ciphers, one-time pads, and malware string obfuscation
- Bitwise NOT: inverts all bits — used to compute the host mask from a subnet mask
- XOR obfuscation in malware: identify XOR loop in disassembly, extract key, decode the string
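The subnet arithmetic and XOR round-trip described above can be demonstrated in Python; the C2 URL and key byte below are made-up examples:

```python
import ipaddress

# Subnet math with bitwise AND / OR / NOT on 32-bit integers.
ip   = int(ipaddress.IPv4Address("192.168.1.75"))
mask = int(ipaddress.IPv4Address("255.255.255.192"))   # a /26 mask

network   = ip & mask                       # AND clears the host bits
broadcast = network | (~mask & 0xFFFFFFFF)  # NOT inverts the mask, OR sets host bits
print(ipaddress.IPv4Address(network))       # 192.168.1.64
print(ipaddress.IPv4Address(broadcast))     # 192.168.1.127

# Single-byte XOR obfuscation: the same operation encodes and decodes.
key = 0x5C                                  # hypothetical key value
plain = b"http://c2.example.com"            # hypothetical obfuscated string
cipher = bytes(b ^ key for b in plain)
print(bytes(b ^ key for b in cipher))       # recovers the original bytes
```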
File type identification based on magic bytes — the first few bytes of a file that identify its format — is a fundamental technique in digital forensics, malware analysis, and security scanning. File extensions are easily changed and cannot be trusted to identify file types reliably. Magic bytes, embedded in the file content itself, provide a reliable type indicator regardless of the filename. Tools like the Unix file command use a database of magic byte signatures to identify file types. Wireshark and network intrusion detection systems use magic bytes to identify protocols and file transfers regardless of port number.
Common magic byte sequences that security analysts should recognize: a file starting with hex 25 50 44 46 (ASCII: %PDF) is a PDF document — PDFs are a common malware delivery vehicle via embedded JavaScript. A file starting with hex 50 4B 03 04 (ASCII: PK) is a ZIP archive — Office .docx, .xlsx, and .jar files are all ZIP-based and may be disguised as other file types. A file starting with hex 7F 45 4C 46 is a Linux executable (0x7F is non-printable and typically rendered as a dot, followed by the letters ELF). A file starting with hex FF D8 FF is a JPEG image. A file starting with hex 4D 5A (ASCII: MZ) is a Windows PE executable — the "MZ" comes from Mark Zbikowski, one of the original DOS developers.
- PDF magic bytes: 25 50 44 46 (ASCII %PDF) — first four bytes of every valid PDF file
- ZIP magic bytes: 50 4B 03 04 (ASCII PK) — Office documents (.docx, .xlsx) are ZIP archives
- ELF magic bytes: 7F 45 4C 46 (ASCII .ELF) — Linux and Unix executable format
- Windows PE magic bytes: 4D 5A (ASCII MZ) — all Windows executables and DLL files
- Unix file command: file suspicious.bin — identifies file type by magic bytes regardless of extension
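A minimal Python sketch of magic-byte identification, covering only the handful of signatures listed above (the real file command consults a far larger signature database):

```python
# Signatures from the list above, longest-prefix entries first.
MAGIC = {
    b"\x25\x50\x44\x46": "PDF document",
    b"\x50\x4B\x03\x04": "ZIP archive (possibly .docx/.xlsx/.jar)",
    b"\x7F\x45\x4C\x46": "ELF executable",
    b"\x4D\x5A":         "Windows PE executable",
    b"\xFF\xD8\xFF":     "JPEG image",
}

def identify(data: bytes) -> str:
    """Match the leading bytes against known signatures, ignoring the filename."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"

print(identify(b"%PDF-1.7 ..."))   # PDF document
print(identify(b"MZ\x90\x00"))     # Windows PE executable
```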