Benchmark of Rust libraries for working with HTML:

- [`scraper`](https://crates.io/crates/scraper) (built on [`html5ever`](https://crates.io/crates/html5ever))
- [`tl`](https://crates.io/crates/tl)
- *(I haven't found anything else that can build a DOM, select an element and serialize it)*

The output is unformatted CSV (probably because I'm lazy); nushell's `from csv` can be used to print a neat table.

Test results on my PC:

```
~/code/html-rs-bench> cargo run -r | save -f result; open result | from csv
    Finished `release` profile [optimized] target(s) in 0.03s
     Running `target/release/html-rs-bench`
╭───┬───────────┬─────────┬───────────┬───────────┬───────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────╮
│ # │  engine   │  page   │ parse min │ parse avg │ parse max │ select min │ select avg │ select max │ serial min │ serial avg │ serial max │
├───┼───────────┼─────────┼───────────┼───────────┼───────────┼────────────┼────────────┼────────────┼────────────┼────────────┼────────────┤
│ 0 │ html5ever │ tochb   │     77.42 │     78.08 │     81.70 │       0.00 │       0.00 │       0.02 │      67.16 │      67.75 │      69.38 │
│ 1 │ html5ever │ android │    123.75 │    124.81 │    126.88 │       0.00 │       0.00 │       0.02 │      66.44 │      66.74 │      66.99 │
│ 2 │ html5ever │ 10mb    │    135.00 │    135.34 │    136.73 │       0.00 │       0.00 │       0.02 │     248.11 │     248.69 │     249.58 │
│ 3 │ tl        │ tochb   │     19.60 │     19.72 │     20.00 │       0.00 │       0.00 │       0.02 │    1024.16 │    1025.30 │    1027.68 │
│ 4 │ tl        │ android │     11.68 │     11.78 │     11.96 │       0.00 │       0.00 │       0.02 │      82.70 │      84.07 │      92.40 │
│ 5 │ tl        │ 10mb    │      2.09 │      2.14 │      2.24 │       0.00 │       0.00 │       0.00 │      14.04 │      14.19 │      14.56 │
╰───┴───────────┴─────────┴───────────┴───────────┴───────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────╯
```

Keep in mind that although `tl` is really fast (except when serializing the many tags in the `tochb` sample), it is not a browser-grade parser, and it doesn't fully support CSS selector syntax (e.g. `a[data-abc]` works, but `a[data-abc="123"]` does not).

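
For readers who want to collect similar min/avg/max columns themselves, here is a minimal, std-only sketch of a timing harness that emits one CSV row. It is not this benchmark's actual code: `Stats`, `summarize`, `time_runs` and `csv_row` are hypothetical names, the choice of milliseconds is an assumption, and the workload in `main` is a dummy stand-in for a real parse/select/serialize call.

```rust
use std::time::Instant;

/// Min/avg/max of a set of timings (milliseconds in this sketch; an assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Stats {
    min: f64,
    avg: f64,
    max: f64,
}

/// Summarize a non-empty slice of timing samples.
fn summarize(samples: &[f64]) -> Stats {
    assert!(!samples.is_empty(), "need at least one sample");
    let min = samples.iter().copied().fold(f64::INFINITY, f64::min);
    let max = samples.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let avg = samples.iter().sum::<f64>() / samples.len() as f64;
    Stats { min, avg, max }
}

/// Run `f` `runs` times and record each run's wall-clock time in milliseconds.
fn time_runs<T>(runs: usize, mut f: impl FnMut() -> T) -> Vec<f64> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            let out = f();
            let elapsed = start.elapsed().as_secs_f64() * 1000.0;
            // Keep the optimizer from discarding the measured work.
            std::hint::black_box(out);
            elapsed
        })
        .collect()
}

/// Format one CSV row: engine, page, then min/avg/max for each phase.
fn csv_row(engine: &str, page: &str, phases: &[Stats]) -> String {
    let mut row = format!("{engine},{page}");
    for s in phases {
        row.push_str(&format!(",{:.2},{:.2},{:.2}", s.min, s.avg, s.max));
    }
    row
}

fn main() {
    // Dummy workload standing in for a real parser call.
    let input = "<p>hello</p>".repeat(10_000);
    let parse = summarize(&time_runs(10, || input.matches('<').count()));
    println!("engine,page,parse min,parse avg,parse max");
    println!("{}", csv_row("dummy", "synthetic", &[parse]));
}
```

Swapping the closure passed to `time_runs` for a real library call (and adding one `Stats` per phase) would give rows shaped like the ones in the table.
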
I do not own anything in the sample HTML pages. Here are the sources:

- `tochb` - [Dictionary of Tocharian B](https://www.win.tue.nl/~aeb/natlang/ie/tochB.html)
- `android` - [BluetoothDevice Android API reference](https://developer.android.com/reference/android/bluetooth/BluetoothDevice)
- `10mb` - [Random 10MB page with Shakespeare's poem](https://github.com/adriancbjie/my-kitty-cat/raw/master/web/staticpages/10MB.html)

The size of developer.android.com pages is surprising... It's ~3.8 MiB even for the API reference homepage. That's bigger than the whole Tocharian B dictionary.