Performance¶
Benchmarked against 2,510 test files from the chardet test suite. All detectors evaluated with the same equivalence rules. Numbers below are CPython 3.12 unless noted.
Accuracy¶
Detector |
Correct |
Accuracy |
Speed |
|---|---|---|---|
chardet 7.1.0 (mypyc) |
2464/2510 |
98.2% |
533 files/s |
chardet 7.1.0 (pure) |
2464/2510 |
98.2% |
372 files/s |
chardet 6.0.0 |
2216/2510 |
88.3% |
12 files/s |
charset-normalizer |
2114/2510 |
84.2% |
129 files/s |
cchardet |
1406/2510 |
56.0% |
1,662 files/s |
chardet leads all detectors on accuracy: +9.9pp vs chardet 6.0.0, +14.0pp vs charset-normalizer, and +42.2pp vs cchardet.
Speed¶
Detector |
Files/s |
Mean |
Median |
p90 |
p95 |
|---|---|---|---|---|---|
cchardet |
1,662 |
0.60ms |
0.05ms |
0.70ms |
1.18ms |
chardet 7.1.0 (mypyc) |
533 |
1.88ms |
0.57ms |
4.44ms |
5.86ms |
chardet 7.1.0 (pure) |
372 |
2.69ms |
0.99ms |
5.89ms |
7.71ms |
charset-normalizer (mypyc) |
129 |
7.75ms |
2.68ms |
22.47ms |
39.36ms |
charset-normalizer (pure) |
67 |
15.00ms |
5.11ms |
43.44ms |
77.37ms |
chardet 6.0.0 |
12 |
82.70ms |
2.31ms |
180.02ms |
372.50ms |
With mypyc compilation, chardet 7.1.0 is 44x faster than chardet 6.0.0 and 4.1x faster than charset-normalizer (mypyc). Even the pure-Python build is 31x faster than chardet 6.0.0 and 5.6x faster than charset-normalizer (pure). Median time per file is 0.57ms (mypyc) / 0.99ms (pure).
Memory¶
Detector |
Import Time |
Import Memory |
Peak Memory |
RSS |
|---|---|---|---|---|
chardet 7.1.0 |
0.015s |
1.9 MiB |
25.9 MiB |
115.6 MiB |
chardet 6.0.0 |
0.053s |
13.0 MiB |
29.5 MiB |
122.3 MiB |
charset-normalizer |
0.010s |
1.4 MiB |
101.3 MiB |
273.3 MiB |
cchardet |
0.001s |
23.2 KiB |
26.8 KiB |
81.9 MiB |
chardet uses 3.9x less peak memory than charset-normalizer and 2.4x less RSS.
Language Detection¶
Detector |
Correct |
Accuracy |
|---|---|---|
chardet 7.1.0 |
2383/2502 |
95.2% |
charset-normalizer |
1476/2502 |
59.0% |
chardet 6.0.0 |
1002/2502 |
40.0% |
cchardet |
0/2502 |
0.0% |
chardet detects language with 95.2% accuracy — +36.2pp vs charset-normalizer and +55.2pp vs chardet 6.0.0. cchardet does not report language.
Thread Safety¶
chardet.detect() and chardet.detect_all() are fully thread-safe.
Each call carries its own state with no shared mutable data between threads.
Thread safety adds no measurable overhead (< 0.1%).
On free-threaded Python (GIL disabled), detection scales with threads.
Standard GIL Python shows no scaling — the GIL serializes threads.
Benchmarked with 2,510 files, encoding_era=ALL:
Python |
1 thread |
2 threads |
4 threads |
8 threads |
|---|---|---|---|---|
3.13 (pure) |
8,740ms |
8,720ms |
8,700ms |
8,700ms |
3.13t (pure) |
10,150ms |
8,530ms (1.2x) |
4,360ms (2.3x) |
5,050ms (2.0x) |
3.13 (mypyc) |
4,650ms |
4,700ms |
4,760ms |
4,760ms |
3.13t (mypyc) |
4,630ms |
2,350ms (2.0x) |
1,300ms (3.6x) |
1,120ms (4.1x) |
3.14 (pure) |
6,460ms |
6,490ms |
6,510ms |
6,500ms |
3.14t (pure) |
7,280ms |
6,330ms (1.2x) |
3,000ms (2.4x) |
1,830ms (4.0x) |
3.14 (mypyc) |
4,700ms |
4,690ms |
4,670ms |
4,670ms |
3.14t (mypyc) |
5,390ms |
2,690ms (2.0x) |
1,450ms (3.7x) |
1,160ms (4.6x) |
Individual UniversalDetector instances are not thread-safe.
Create one instance per thread when using the streaming API.
Optional mypyc Compilation¶
Prebuilt mypyc-compiled wheels are
published to PyPI for CPython on Linux, macOS, and Windows. A regular
pip install chardet will pick them up automatically — no extra flags
needed.
Build |
Files/s |
Speedup |
|---|---|---|
Pure Python |
390 |
baseline |
mypyc compiled |
581 |
1.49x |
Pure-Python wheels are always available for PyPy and platforms without prebuilt binaries.
Performance Across Python Versions¶
Benchmarked chardet 7.1.0 across all supported Python versions
(macOS aarch64, 2,510 files, encoding_era=ALL). CPython versions
install mypyc-compiled wheels automatically; PyPy receives the
pure-Python wheel.
Python |
Wheel |
Total |
Files/s |
Mean |
Median |
p90 |
p95 |
|---|---|---|---|---|---|---|---|
CPython 3.10 |
mypyc |
4,336ms |
579 |
1.73ms |
0.54ms |
4.15ms |
5.16ms |
CPython 3.10 |
pure |
7,987ms |
314 |
3.18ms |
1.16ms |
7.03ms |
9.02ms |
CPython 3.11 |
mypyc |
3,746ms |
670 |
1.49ms |
0.46ms |
3.56ms |
4.51ms |
CPython 3.11 |
pure |
6,490ms |
387 |
2.59ms |
0.94ms |
5.73ms |
7.44ms |
CPython 3.12 |
mypyc |
4,320ms |
581 |
1.72ms |
0.54ms |
4.02ms |
5.16ms |
CPython 3.12 |
pure |
6,433ms |
390 |
2.56ms |
0.95ms |
5.55ms |
7.20ms |
CPython 3.13 |
mypyc |
4,703ms |
534 |
1.87ms |
0.58ms |
4.37ms |
5.67ms |
CPython 3.13 |
pure |
8,560ms |
293 |
3.41ms |
1.28ms |
7.44ms |
9.44ms |
CPython 3.14 |
mypyc |
4,616ms |
544 |
1.84ms |
0.58ms |
4.36ms |
5.61ms |
CPython 3.14 |
pure |
6,514ms |
385 |
2.60ms |
0.97ms |
5.67ms |
7.43ms |
PyPy 3.10 |
pure |
6,194ms |
405 |
2.47ms |
0.26ms |
5.15ms |
7.37ms |
PyPy 3.11 |
pure |
6,128ms |
410 |
2.44ms |
0.26ms |
5.08ms |
7.83ms |
CPython 3.11 + mypyc is the fastest combination at 670 files/s. mypyc provides a 1.4–1.8x speedup across CPython versions. PyPy’s JIT is competitive with mypyc: pure Python on PyPy (410 files/s) beats every pure CPython version and reaches 61–100% of mypyc-compiled CPython throughput.