Performance
All benchmarks use 2,179 test files from the chardet test suite, and every detector is evaluated with the same equivalence rules. Numbers below are for pure Python on CPython 3.12 unless noted.
Accuracy

| Detector | Correct | Accuracy | Speed |
|---|---|---|---|
| chardet 7.0 (mypyc) | 2110/2179 | 96.8% | 494 files/s |
| chardet 7.0 (pure) | 2110/2179 | 96.8% | 336 files/s |
| chardet 6.0.0 | 2060/2179 | 94.5% | 12 files/s |
| charset-normalizer | 1942/2179 | 89.1% | 66 files/s |
| cchardet | 1245/2179 | 57.1% | 1,811 files/s |

chardet 7.0 leads all detectors on accuracy: +2.3pp over chardet 6.0.0, +7.7pp over charset-normalizer, and +39.7pp over cchardet.
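The percentage-point gaps quoted above follow directly from the Correct column. A minimal sketch that recomputes them is shown here; the `CORRECT` dictionary and both helper functions are illustrative names, not part of chardet.

```python
# Correct counts copied from the accuracy table; 2,179 files in total.
TOTAL_FILES = 2179
CORRECT = {
    "chardet 7.0": 2110,
    "chardet 6.0.0": 2060,
    "charset-normalizer": 1942,
    "cchardet": 1245,
}


def accuracy_pct(correct: int, total: int = TOTAL_FILES) -> float:
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100 * correct / total, 1)


def gap_pp(name: str, baseline: str = "chardet 7.0") -> float:
    """Percentage-point lead of `baseline` over `name`, using rounded accuracies."""
    return round(accuracy_pct(CORRECT[baseline]) - accuracy_pct(CORRECT[name]), 1)


for name in CORRECT:
    print(f"{name}: {accuracy_pct(CORRECT[name])}% (gap {gap_pp(name)}pp)")
```

The gaps are differences of percentages (percentage points), not relative improvements, which is why they are written as "pp".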
Speed

| Detector | Files/s | Mean | Median | p90 | p95 |
|---|---|---|---|---|---|
| cchardet | 1,811 | 0.55ms | 0.07ms | 0.65ms | 0.92ms |
| chardet 7.0 (mypyc) | 494 | 2.02ms | 0.61ms | 3.86ms | 4.49ms |
| chardet 7.0 (pure) | 336 | 2.98ms | 1.11ms | 5.37ms | 6.21ms |
| charset-normalizer | 66 | 15.17ms | 4.67ms | 49.31ms | 70.71ms |
| chardet 6.0.0 | 12 | 83.19ms | 16.32ms | 122.32ms | 319.77ms |

With mypyc compilation, chardet 7.0 is 41x faster than chardet 6.0.0 and 7.5x faster than charset-normalizer. Even the pure-Python build is 28x faster than chardet 6.0.0 and 5.1x faster than charset-normalizer. Median time per file is 0.61ms (mypyc) / 1.11ms (pure).
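The source does not describe the benchmark harness itself. As a rough sketch of how per-file figures like these could be collected, the hypothetical helper below times one call per sample and summarises the latencies; `time_detector` and its return keys are assumptions made for illustration.

```python
import statistics
import time
from typing import Callable, Iterable


def time_detector(detect: Callable[[bytes], object], samples: Iterable[bytes]) -> dict:
    """Time one detect() call per sample and summarise the latencies in ms."""
    latencies_ms = []
    for data in samples:
        start = time.perf_counter()
        detect(data)
        latencies_ms.append((time.perf_counter() - start) * 1000)

    # quantiles(n=100) returns the 99 cut points p1..p99 (needs >= 2 samples).
    cuts = statistics.quantiles(latencies_ms, n=100)
    total_s = sum(latencies_ms) / 1000
    return {
        "files_per_s": len(latencies_ms) / total_s if total_s else float("inf"),
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "p90_ms": cuts[89],
        "p95_ms": cuts[94],
    }
```

Passing `chardet.detect` as `detect` and the raw bytes of each test file as `samples` would produce one row of the table. The large gap between mean and median in every row suggests a skewed latency distribution, so reporting both is more informative than either alone.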
Memory

| Detector | Import Memory | Peak Memory | RSS |
|---|---|---|---|
| chardet 7.0 | 96 B | 22.5 MiB | 96.1 MiB |
| chardet 6.0.0 | 96 B | 16.4 MiB | 101.8 MiB |
| charset-normalizer | 1.3 MiB | 101.8 MiB | 265.7 MiB |
| cchardet | 23.6 KiB | 27.2 KiB | 62.8 MiB |

chardet 7.0 uses negligible import memory (96 B) and, compared with charset-normalizer, 4.5x less peak memory and 2.8x less RSS.
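The source does not say which tool produced these memory figures. One way to obtain a comparable peak-allocation number is the standard-library `tracemalloc` module, sketched below; `peak_traced_mib` is a hypothetical helper, and its result need not match the table's methodology.

```python
import tracemalloc
from typing import Callable


def peak_traced_mib(work: Callable[[], object]) -> float:
    """Run work() and return the peak Python-level allocation in MiB."""
    tracemalloc.start()
    try:
        work()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / (1024 * 1024)
```

`tracemalloc` only sees memory requested through Python's allocators, so memory allocated directly by a C extension may not be counted. RSS is a process-level figure reported by the operating system and has to be measured separately.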
Language Detection

| Detector | Correct | Accuracy |
|---|---|---|
| chardet 7.0 | 1964/2171 | 90.5% |
| chardet 6.0.0 | 1016/2171 | 46.8% |
| charset-normalizer | 0/2171 | 0.0% |
| cchardet | 0/2171 | 0.0% |

chardet 7.0 reports a language for every file. charset-normalizer and cchardet do not report language at all, which is why they score 0.
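To illustrate why a detector that reports no language scores 0/2171, here is a small scoring sketch. It assumes each result is a mapping with a "language" key, as in the dictionary returned by `chardet.detect()`. The case-insensitive exact match is an assumption made for this example, not the benchmark's actual equivalence rule.

```python
from typing import Iterable, Mapping, Optional, Tuple


def language_accuracy(
    results: Iterable[Tuple[Mapping[str, Optional[str]], str]],
) -> Tuple[int, int]:
    """Count results whose "language" value matches the expected language.

    A missing or None language counts as incorrect, so a detector that never
    reports a language scores 0 regardless of its encoding accuracy.
    """
    correct = total = 0
    for result, expected in results:
        total += 1
        reported = result.get("language")
        if reported is not None and reported.casefold() == expected.casefold():
            correct += 1
    return correct, total
```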
Thread Safety
chardet.detect() and chardet.detect_all() are fully thread-safe.
Each call carries its own state with no shared mutable data between threads.
Thread safety adds no measurable overhead (< 0.1%).
On free-threaded Python (3.13t+, GIL disabled), detection scales with threads:

| Threads | Time | Speedup |
|---|---|---|
| 1 | 4,361ms | baseline |
| 2 | 2,337ms | 1.9x |
| 4 | 1,930ms | 2.3x |

Individual UniversalDetector instances are not thread-safe.
Create one instance per thread when using the streaming API.
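One way to follow the one-instance-per-thread rule is to keep the detector in thread-local storage. The sketch below is written against a generic `make_detector` factory (a hypothetical parameter). It assumes the detector offers the long-standing streaming interface: `reset()`, `feed()`, `done`, `close()`, and `result`. Passing chardet's UniversalDetector class as the factory is the intended use.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List

# Each thread sees its own attributes on this object.
_local = threading.local()


def detect_stream(chunks: Iterable[bytes], make_detector: Callable[[], Any]) -> Any:
    """Feed chunks to a detector instance owned by the calling thread."""
    detector = getattr(_local, "detector", None)
    if detector is None:
        # First call on this thread: create and cache a private instance.
        detector = _local.detector = make_detector()
    detector.reset()
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:
            break
    detector.close()
    return detector.result


def detect_many(
    streams: Iterable[Iterable[bytes]],
    make_detector: Callable[[], Any],
    workers: int = 4,
) -> List[Any]:
    """Run detect_stream over many chunk streams on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: detect_stream(s, make_detector), streams))
```

Reusing the cached instance with `reset()` avoids constructing a detector per stream while still ensuring that no instance is ever shared between threads.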
Optional mypyc Compilation
Prebuilt mypyc-compiled wheels are
published to PyPI for CPython on Linux, macOS, and Windows. A regular
pip install chardet will pick them up automatically — no extra flags
needed.

| Build | Files/s | Speedup |
|---|---|---|
| Pure Python | 332 | baseline |
| mypyc compiled | 494 | 1.49x |

Pure-Python wheels are always available for PyPy and platforms without prebuilt binaries.
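To check which kind of build was installed, one option is to inspect the file suffix of an imported module. The helper below uses only the standard library; `build_kind` is a hypothetical name, and which chardet submodules are compiled in a given wheel is not stated here, so the module name to pass is left to the reader.

```python
import importlib.machinery
import importlib.util


def build_kind(module_name: str) -> str:
    """Classify a module as 'compiled', 'pure', or 'unknown' from its file suffix."""
    spec = importlib.util.find_spec(module_name)
    origin = getattr(spec, "origin", None)
    if not origin:
        return "unknown"
    # Extension modules end in platform suffixes such as .so or .pyd.
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "compiled"
    if origin.endswith(tuple(importlib.machinery.SOURCE_SUFFIXES)):
        return "pure"
    return "unknown"
```

A package's top-level `__init__` may remain a plain `.py` file even when its submodules are compiled, so checking a submodule is usually more telling than checking the package name.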
Performance Across Python Versions
Benchmarked chardet 7.0.0rc4 from PyPI across all supported Python
versions (macOS aarch64, 2,179 files, encoding_era=ALL). CPython
versions install mypyc-compiled wheels automatically; PyPy receives
the pure-Python wheel.

| Python | Wheel | Import | Total | Files/s | Mean | Median | p90 | p95 |
|---|---|---|---|---|---|---|---|---|
| CPython 3.10 | mypyc | 2.509s | 4,257ms | 512 | 1.95ms | 0.60ms | 3.72ms | 4.28ms |
| CPython 3.10 | pure | 0.038s | 8,172ms | 267 | 3.75ms | 1.41ms | 6.89ms | 7.79ms |
| CPython 3.11 | mypyc | 2.736s | 3,815ms | 571 | 1.75ms | 0.52ms | 3.41ms | 3.89ms |
| CPython 3.11 | pure | 0.040s | 6,345ms | 343 | 2.91ms | 1.09ms | 5.34ms | 6.20ms |
| CPython 3.12 | mypyc | 2.930s | 4,455ms | 489 | 2.04ms | 0.62ms | 3.87ms | 4.44ms |
| CPython 3.12 | pure | 0.018s | 6,567ms | 332 | 3.01ms | 1.13ms | 5.35ms | 6.18ms |
| CPython 3.13 | mypyc | 2.755s | 4,678ms | 466 | 2.15ms | 0.63ms | 4.07ms | 4.71ms |
| CPython 3.13 | pure | 0.054s | 8,666ms | 251 | 3.98ms | 1.46ms | 7.01ms | 7.91ms |
| CPython 3.14 | mypyc | 2.666s | 4,656ms | 468 | 2.14ms | 0.64ms | 4.07ms | 4.75ms |
| CPython 3.14 | pure | 0.013s | 6,525ms | 334 | 2.99ms | 1.12ms | 5.43ms | 6.24ms |
| PyPy 3.10 | pure | 0.031s | 5,392ms | 404 | 2.47ms | 0.31ms | 4.97ms | 5.52ms |
| PyPy 3.11 | pure | 0.138s | 5,409ms | 403 | 2.48ms | 0.30ms | 4.98ms | 5.52ms |

CPython 3.11 + mypyc is the fastest combination at 571 files/s. mypyc provides a 1.4–1.9x speedup across CPython versions. PyPy’s JIT is competitive with mypyc: pure Python on PyPy (404 files/s) beats every pure CPython version and reaches roughly 71–87% of mypyc-compiled CPython throughput, depending on the CPython version.
mypyc builds pay ~2.5–2.9s upfront for import (loading compiled
extension modules and lazy model initialization) vs 13–54ms for
pure-Python builds on CPython (31–138ms on PyPy). This is a one-time
cost amortized over all subsequent detect() calls.
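To make the amortization concrete, the sketch below estimates how many files a process must handle before the compiled build's faster per-file time makes up for its slower import. It assumes the Mean column excludes import time, which is not stated above; `break_even_files` is a hypothetical helper.

```python
import math


def break_even_files(
    compiled_import_s: float,
    pure_import_s: float,
    compiled_mean_ms: float,
    pure_mean_ms: float,
) -> int:
    """Files needed before the compiled build recoups its larger import cost."""
    extra_import_ms = (compiled_import_s - pure_import_s) * 1000
    saved_per_file_ms = pure_mean_ms - compiled_mean_ms
    if saved_per_file_ms <= 0:
        raise ValueError("the compiled build must be faster per file")
    return max(0, math.ceil(extra_import_ms / saved_per_file_ms))


# CPython 3.12 rows: mypyc imports in 2.930s at 2.04ms/file,
# pure Python imports in 0.018s at 3.01ms/file.
print(break_even_files(2.930, 0.018, 2.04, 3.01))
```

With the CPython 3.12 figures this comes out at roughly 3,000 files, so under that assumption a short-lived script that checks only a handful of files may finish sooner with the pure-Python build, while long-running processes benefit from the compiled one.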