When programmatically using an AI chatbot API, it is easy to run up big bills. To avoid this, monitor token usage carefully, but resist the urge to estimate it by simply dividing character count by a fixed ratio. That estimate will be inaccurate unless all of the following are true (a sketch of such an estimator follows this list):
- You use the same tokenizer. For example, Claude 3.5 Sonnet and Claude 3.5 Haiku share a tokenizer, but each vendor uses its own.
- Your inputs always have the same proportion of text and images. If they are always text, that is a good start. If each is a short text plus an image of the same dimensions, that might work.
- Your input prompts have low diversity. If some prompts are in English and others in Japanese, the ratios will be wrong. Likewise, if some are mostly English while others are mostly source code, your estimates will be poor.
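If those conditions do hold, the estimator is just division by an average characters-per-token ratio. A minimal sketch (the function name and the 3.5 default are my assumptions; the ratio is plausible for typical English prose, and as the table below shows, it does not transfer to code, CJK text, or images):

```python
def estimate_tokens(text: str, chars_per_token: float = 3.5) -> int:
    """Rough token estimate from character count.

    Only valid under the caveats listed above; the 3.5 default is an
    assumed English-prose ratio, not a universal constant.
    """
    return round(len(text) / chars_per_token)

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # ~13
```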
Token counts for various documents
To make this more concrete, I tested two tokenizers across a variety of prompts:
| Text type | Characters | Amazon Nova Lite tokens | Nova chars/token | Claude 3 Haiku tokens | Claude vs. Nova tokens |
|---|---:|---:|---:|---:|---:|
| Python code | 14,914 | 4,331 | 3.4 | 4,648 | +7% |
| News article in English | 6,555 | 6,034 | 1.1 | 6,104 | +1% |
| Privacy policy in English and HTML | 90,179 | 31,436 | 2.9 | 32,937 | +5% |
| JPEG photo, width 500 px | 74,660 | 970 | 77.0 | 503 | -48% |
| Wikipedia article in Japanese | 1,793 | 2,754 | 0.7 | 2,014 | -27% |
| Different Wikipedia article in Chinese | 2,715 | 5,717 | 0.5 | 4,662 | -18% |
| Linux /var/log | 8,543 | 3,942 | 2.2 | 3,503 | -11% |
| Minified JavaScript | 20,856 | 9,294 | 2.2 | 9,366 | +1% |
| "The Raven" by Poe | 6,838 | 1,628 | 4.2 | 1,900 | +17% |
| Russian poem in Cyrillic | 591 | 277 | 2.1 | 274 | -1% |
| SQL DML | 9,265 | 5,843 | 1.6 | 4,851 | -17% |
| Assembly code | 10,621 | 4,265 | 2.5 | 4,842 | +14% |
| UTF-8 sampler | 10,467 | 5,338 | 2.0 | 5,979 | +12% |
| Thai Wikipedia article | 7,645 | 6,223 | 1.2 | 6,592 | +6% |
| Bengali text | 7,191 | 6,119 | 1.2 | 6,624 | +8% |
| Microsoft policy in Korean | 6,945 | 6,775 | 1.0 | 6,348 | -6% |
| Rare English words | 9,157 | 1,995 | 4.6 | 2,339 | +17% |
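The derived columns are simple arithmetic over the measured counts. For example, for the Python code row:

```python
# Reproduce the derived columns for the "Python code" row of the table above.
chars, nova_tokens, claude_tokens = 14_914, 4_331, 4_648

print(f"Nova chars/token: {chars / nova_tokens:.1f}")  # 3.4
print(f"Claude vs. Nova:  {(claude_tokens - nova_tokens) / nova_tokens:+.0%}")  # +7%
```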
Your results may vary, so run your own tests. Tip: your API vendor's "playground" is a quick way to benchmark token counts without writing any code.
Correct code
Here is example code that reads accurate token counts from an Amazon Bedrock Converse API response.
```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# full_model_id, conversation, max_tokens, temperature, and top_p
# are defined elsewhere in your application.
response = bedrock.converse(
    modelId=full_model_id,
    messages=conversation,
    inferenceConfig={"maxTokens": max_tokens, "temperature": temperature, "topP": top_p},
)

output_text = response["output"]["message"]["content"][0]["text"]
output_characters = len(output_text)

# The usage field reports the exact token counts you are billed for.
input_tokens = response["usage"]["inputTokens"]
output_tokens = response["usage"]["outputTokens"]
```
For OpenAI ChatGPT, check the usage field in the JSON structure.
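A minimal sketch with the openai Python SDK (the model name is illustrative, and an OPENAI_API_KEY environment variable is assumed):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "The quick brown fox jumps over the lazy dog."}],
)

# The usage field reports the exact token counts you are billed for.
usage = response.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```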
And here is example code for Google Gemini:
```python
from google import genai

client = genai.Client()
prompt = "The quick brown fox jumps over the lazy dog."

# Count tokens before sending the request, using the client method.
total_tokens = client.models.count_tokens(
    model="gemini-2.0-flash", contents=prompt
)
print("total_tokens:", total_tokens)
# e.g., total_tokens: 10

response = client.models.generate_content(
    model="gemini-2.0-flash", contents=prompt
)

# usage_metadata provides detailed token counts for the actual call.
print(response.usage_metadata)
# e.g., prompt_token_count: 11, candidates_token_count: 73, total_token_count: 84
```