The Right Way to Squeeze Out CPU Performance in Web Apps
This article was translated, original article here.
Browser Thread Allocation
Modern browsers generally adopt a multi-process architecture, separating functions like network I/O, storage, and plugins into different processes. Crucially, an independent process is created for each newly opened tab. The benefits are obvious – a frozen page won’t freeze the entire browser, nor affect other pages (with exceptions; when system resources are insufficient to create new processes, the browser may merge some tabs into the same process).
A web application runs within a tab process, known as the Renderer Process. This process handles various tasks for the entire webpage’s operation. Typically, it maintains the following threads:
- GUI Rendering Thread
- JavaScript Thread
- Timer Trigger Thread
- Event Trigger Thread
- HTTP Asynchronous Request Thread
Among these, the GUI thread and the JavaScript thread are mutually exclusive. When GUI rendering starts, JavaScript parsing stops; conversely, when JavaScript is executing, GUI rendering is suspended.
Image from "Inside look at modern web browser (part 1)"
The Pros and Cons of JavaScript’s Single Thread
JavaScript was designed for single-threaded execution. This design choice meant browsers didn’t need to handle the complexity of resource contention (like DOM manipulation) from multiple threads simultaneously.
But, Gul’dan, what is the cost? Because the JavaScript thread and GUI thread are mutually exclusive, when the JavaScript thread executes even a slightly time-consuming calculation, the UI is blocked, and the application freezes. An even bigger issue was that web applications couldn’t fully utilize CPU resources, meaning they couldn’t handle heavy computation scenarios. For a long time, this relegated web development to “flyweight” or even “strawweight” applications (referring to the lightest weight classes in professional boxing).
Workers and Multithreading
To solve this problem, Web Workers emerged as part of the HTML5 standard and were integrated into browser engines.
- Worker threads are created by the main thread or another Worker thread and run independently. The host that creates a Worker can terminate it at any time.
- The main thread and Worker threads have completely separate contexts; data is isolated, they cannot directly access each other’s variables, and communication requires dedicated interfaces.
- The Worker thread context is almost identical to the main thread’s, but Workers cannot access the DOM.
Despite browsers having multithreading capabilities, features like data isolation and the Worker’s inability to manipulate the DOM preserve the essential single-threaded nature of JavaScript. This allows the main thread to focus on handling user interactions, while computationally intensive or high-latency tasks are offloaded to Worker threads, keeping the UI fluid and the user experience smooth.
Web Workers API
Web Workers are implemented through the Web Workers API provided by the host. Notably, Web Workers themselves support the Web Workers API, meaning we can nest Workers.
A new worker thread is created by instantiating a Worker object. The Worker constructor is exposed on the window object (within a Worker, on the self object). Before use, check whether the current browser supports Workers:
if (!window.Worker) {
  alert('Current browser does not support Worker');
} else {
  // Your code
}
Worker is a constructor that takes two parameters:
- aURL: A USVString parameter specifying the script file path. Important: This script must be from the same origin.
- options: An optional configuration object (for example type, credentials, and name).
let worker = new Worker('worker.js', options);
Communication between the main thread and the worker thread happens via events and is bidirectional. The sender uses postMessage() to send data, and the receiver registers an onmessage event handler to receive it, similar to iframe communication.
// Main Thread
worker.postMessage(message, [transferList]);
// Worker Thread (worker.js)
self.onmessage = function(evt) {
  // Access data via the 'data' property
  const { data } = evt;
  // ... other code ...
};
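The request/response pattern above can be wrapped in a Promise so that callers simply await the reply. The following is a minimal sketch, not part of the original article: `callWorker` and `createEchoWorker` are hypothetical names, and the fake worker exists only so the example runs outside a browser.

```javascript
// Wrap one request/response round trip with a worker in a Promise.
// `worker` is anything exposing postMessage() plus onmessage/onerror
// properties, which is what a real Worker instance provides.
function callWorker(worker, message, transferList = []) {
  return new Promise((resolve, reject) => {
    worker.onmessage = (evt) => resolve(evt.data);
    worker.onerror = (err) => reject(err);
    worker.postMessage(message, transferList);
  });
}

// Stand-in for a real Worker (an assumption of this sketch).
// It answers every message asynchronously with the doubled input.
function createEchoWorker() {
  const fake = {
    onmessage: null,
    onerror: null,
    postMessage(data) {
      setTimeout(() => fake.onmessage({ data: data * 2 }), 0);
    },
  };
  return fake;
}

callWorker(createEchoWorker(), 21).then((reply) => {
  console.log('reply from worker:', reply);
});
```

In a browser, the first argument would be a real instance such as `new Worker('worker.js')`. Because the handler properties are overwritten on every call, this sketch supports only one request in flight per worker.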
Data is transmitted by deep copy (the structured clone algorithm): the sender's data is copied to the receiver. Besides the data to send, postMessage() also accepts an optional transfer array.
The transfer array accepts Transferable objects, which include ArrayBuffer, MessagePort, ImageBitmap, and OffscreenCanvas. Unlike regular JavaScript data, a Transferable object is backed by a resource (such as a block of binary data) whose ownership can be moved. When the host lists an object in the transfer array, ownership of the underlying resource is handed to the worker thread and no data is copied. After the transfer, the worker thread controls the data block and the host loses access to it. This is similar to passing a pointer in C, except that only one thread can access the data at a time, which avoids contention.
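The difference between copying and transferring can be observed on a single thread. The sketch below uses `structuredClone()`, which is not mentioned in the article but applies the same structured clone algorithm and accepts the same kind of transfer list as `postMessage()`. It assumes a runtime that provides `structuredClone` (current browsers, Node 17 or later).

```javascript
// A small typed array standing in for a large block of binary data.
const original = new Uint8Array([1, 2, 3, 4]);

// Without a transfer list the data is copied; the source stays usable.
const copied = structuredClone(original);

// With a transfer list the underlying ArrayBuffer changes owner.
// No bytes are copied, and the source buffer becomes detached.
const moved = structuredClone(original, { transfer: [original.buffer] });

const report = {
  copiedLength: copied.length,
  movedLength: moved.length,
  originalLengthAfterTransfer: original.length,
};
console.log(report);
```

After the transfer, the sender's view reports a length of 0, which is the practical meaning of "the host loses access". The same happens to a buffer passed in the transfer array of `worker.postMessage()`.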
When a worker thread finishes its task and is no longer needed, it should be closed to conserve system resources:
// Main Thread
worker.terminate();
Multithreading Programming Practice
Optimal Number of Threads
Generally, utilizing all available CPU threads for computation when the OS is idle yields the fastest results. The BOM provides a read-only property, window.navigator.hardwareConcurrency, specifically for querying the number of logical processors available on the current machine. Before creating multiple threads, query this property to determine the maximum thread count.
During program execution, it’s difficult to monitor the number of idle CPU threads in real-time, making dynamic thread allocation challenging. The author’s experience suggests: always start the maximum number of threads. The task of determining how many threads the CPU can actually use and how to allocate them can be left to the browser.
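A defensive way to read that property is sketched below. The helper name `getMaxWorkerCount` and the fallback value of 4 are assumptions made for illustration; the article itself does not prescribe a fallback.

```javascript
// Used when hardwareConcurrency is missing or not a positive integer.
// The value 4 is an assumption of this sketch, not a recommendation
// from the article.
const FALLBACK_THREADS = 4;

// `nav` defaults to the global navigator object; passing it in makes
// the function easy to exercise with fake values.
function getMaxWorkerCount(nav = globalThis.navigator) {
  const reported = nav && nav.hardwareConcurrency;
  return Number.isInteger(reported) && reported > 0
    ? reported
    : FALLBACK_THREADS;
}

console.log('workers to start:', getMaxWorkerCount());
```

The same function works inside a Worker, where `globalThis.navigator` refers to the worker's own navigator object.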
Thread Pool
When a time-consuming module needs to be called repeatedly, create a Worker thread pool for it. Each element in the pool corresponds to one Worker thread and its current usage status (inUse).
let workerList = [];
// Query CPU thread count, create thread pool
for (let i = 0; i < window.navigator.hardwareConcurrency; i++) {
  let newWorker = {
    worker: new Worker('cpuworker.js'),
    inUse: false
  };
  workerList.push(newWorker);
}
When inUse is false, the thread is idle. Find an idle Worker, execute postMessage(), and simultaneously set inUse to true, marking it as busy so other tasks cannot use it. When the Worker finishes execution and returns data to the main thread, the main thread releases that Worker in the pool by setting inUse back to false. If all threads in the pool are busy, tasks need to wait until an idle thread becomes available.
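The acquire, release, and wait behaviour described above could be sketched as follows. This is one possible arrangement under stated assumptions, not the author's implementation: the `WorkerPool` class, its FIFO queue, and the `createWorker` factory argument are all hypothetical names introduced here.

```javascript
// A small pool: each slot pairs a worker with an inUse flag, and tasks
// that arrive while every slot is busy wait in a FIFO queue.
class WorkerPool {
  // `createWorker` is a factory such as () => new Worker('cpuworker.js').
  constructor(size, createWorker) {
    this.slots = [];
    this.queue = [];
    for (let i = 0; i < size; i++) {
      this.slots.push({ worker: createWorker(), inUse: false });
    }
  }

  // Queue a task; the returned Promise resolves with the worker's reply.
  run(message, transferList = []) {
    return new Promise((resolve, reject) => {
      this.queue.push({ message, transferList, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle slots until tasks or idle slots run out.
  dispatch() {
    for (const slot of this.slots) {
      if (this.queue.length === 0) return;
      if (slot.inUse) continue;
      const task = this.queue.shift();
      slot.inUse = true; // mark busy so no other task takes this slot
      const release = () => {
        slot.inUse = false; // back to idle
        this.dispatch(); // give a waiting task the freed slot
      };
      slot.worker.onmessage = (evt) => {
        release();
        task.resolve(evt.data);
      };
      slot.worker.onerror = (err) => {
        release();
        task.reject(err);
      };
      slot.worker.postMessage(task.message, task.transferList);
    }
  }

  // Shut down every worker when the pool is no longer needed.
  terminate() {
    this.slots.forEach((slot) => slot.worker.terminate());
  }
}
```

In a browser, usage might look like `const pool = new WorkerPool(navigator.hardwareConcurrency, () => new Worker('cpuworker.js'))` followed by `pool.run(data).then(...)`. Taking a factory instead of a script path keeps the class independent of the Worker constructor, so it can be exercised with stand-in objects.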
Nested Workers
When dealing with massive data processing, we can split the large dataset into smaller chunks, send them to individual Workers for computation, and finally combine the results. However, splitting and concatenating very large arrays on the main thread can also block the UI. Simply moving the computation into a single Worker only shifts the bottleneck; performance doesn’t improve. Therefore, a nested Worker strategy should be employed:
- The main thread creates a Master Worker dedicated to data splitting and concatenation.
- During splitting, this Master Worker dynamically creates Sub-Workers to perform the actual computation.
- The Master Worker then aggregates the results from the Sub-Workers.
- The Master Worker sends the final result back to the main thread.
// Main Thread
const mWorker = new Worker('mainWorker.js');
mWorker.postMessage(data, [data.buffer]); // Transfer ArrayBuffer ownership
mWorker.onmessage = function(e) {
  const { data: processedData } = e;
  // ... use processedData ...
  mWorker.terminate(); // Close Master Worker, reclaim resources
};
// mainWorker.js
self.onmessage = function(e) {
  const { data: originalData } = e;
  // The Master Worker occupies one thread itself; keep at least one Sub-Worker
  const cpuNum = Math.max(1, self.navigator.hardwareConcurrency - 1);
  const sliceSize = Math.floor(originalData.length / cpuNum);
  // Allocate memory for the result (assuming originalData is a TypedArray)
  const resultData = new originalData.constructor(originalData.length);
  let workersInUse = cpuNum;
  for (let i = 0; i < cpuNum; i++) {
    const start = i * sliceSize;
    const end = (i === cpuNum - 1) ? originalData.length : start + sliceSize; // Handle last chunk
    // Copy the slice into its own buffer. subarray() would share originalData's
    // buffer, and transferring that buffer would detach every other chunk.
    const chunkData = originalData.slice(start, end);
    const subWorker = new Worker('subWorker.js');
    subWorker.onmessage = function(e) {
      const { data: processedChunk } = e;
      // Place processed chunk into result
      resultData.set(processedChunk, start);
      workersInUse--;
      subWorker.terminate(); // Close Sub-Worker
      if (workersInUse <= 0) {
        // All Sub-Workers done, send final result back
        self.postMessage(resultData, [resultData.buffer]); // Transfer ownership
      }
    };
    subWorker.postMessage(chunkData, [chunkData.buffer]); // Transfer ownership
  }
};
// subWorker.js
self.onmessage = function(e) {
  const { data: chunk } = e;
  // Perform the time-consuming computation on this chunk
  // (someHeavyFunction is assumed to return a TypedArray)
  const processedChunk = someHeavyFunction(chunk);
  self.postMessage(processedChunk, [processedChunk.buffer]); // Transfer ownership
};
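The splitting arithmetic used by the Master Worker can be isolated into a pure helper, which makes the edge cases easier to check. This is a sketch only: `splitRanges` is a hypothetical name, and the 10-element array stands in for real data.

```javascript
// Split [0, length) into at most `parts` contiguous ranges. The last
// range absorbs the remainder, like the "handle last chunk" rule above.
function splitRanges(length, parts) {
  // Never create more chunks than elements, and always at least one.
  const count = Math.max(1, Math.min(parts, length));
  const size = Math.floor(length / count);
  const ranges = [];
  for (let i = 0; i < count; i++) {
    const start = i * size;
    const end = i === count - 1 ? length : start + size;
    ranges.push({ start, end });
  }
  return ranges;
}

// Example data standing in for the bitmap bytes.
const data = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]);

// slice() copies each range into its own ArrayBuffer, so every chunk
// can be transferred independently. subarray() would return views that
// all share data.buffer, which cannot be transferred piece by piece.
const chunks = splitRanges(data.length, 3).map((r) => data.slice(r.start, r.end));

console.log(chunks.map((c) => c.length));
```

Each element of `chunks` owns a separate buffer, so passing `[chunk.buffer]` as the transfer array for one Sub-Worker leaves the other chunks intact.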
Can the number of Workers exceed the CPU’s available threads? Technically, yes. The browser will manage the actual thread scheduling. However, creating more Workers than the hardware can handle efficiently will not improve performance and may even degrade it.
Multithreading Performance Comparison
Here’s a benchmark example using the nested Worker strategy described above:
- Task: Traverse a large bitmap (resolution 7451×4192), find pixels where the color value is not #000000, draw them with another color, and add noise to pixels that are #000000.
- Comparison: Performance between single-thread (main thread), single Worker thread, and multiple Worker threads.
[Figure: the benchmark scheme]
Results:
- Main Thread Parsing: The page becomes unresponsive for over 1.3 seconds.
- One Worker Thread: Takes about 1.3 seconds. While the page doesn’t freeze, there’s no significant parsing performance gain.
- Multiple Worker Threads: Takes only 240 milliseconds (test device had 16 logical cores). The improvement is very significant: processing time drops by about 82%, which is roughly 5.4 times faster than a single thread.
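The relationship between the two timings quoted above can be checked with a few lines of arithmetic. The snippet takes 1.3 seconds and 240 milliseconds from the results and derives both the speedup factor and the share of time saved. The variable names are illustrative.

```javascript
// Timings quoted in the results above, in milliseconds.
const singleWorkerMs = 1300;
const multiWorkerMs = 240;

// How many times faster the multi-worker run is.
const speedupFactor = singleWorkerMs / multiWorkerMs;

// Share of the original processing time that was saved.
const timeReduction = 1 - multiWorkerMs / singleWorkerMs;

console.log(`${speedupFactor.toFixed(1)}x faster`);
console.log(`${(timeReduction * 100).toFixed(1)}% less time`);
```

The two figures describe the same measurement: a reduction of roughly 82% in elapsed time corresponds to a run that is a little over five times faster.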
Limitations of Workers
Using Workers is straightforward, but there are limitations:
Same-Origin Restriction
Worker scripts must be from the same origin and do not support the file:// protocol. You must use a server to debug Worker programs. Deploying scripts to a CDN (a different domain than the app) can cause Worker loading to fail because of the same-origin restriction.
Solution: Convert the Worker code string into a Blob and create an Object URL:
let script = `console.log('hello world!');`;
let workerBlob = new Blob([script], { type: 'text/javascript' });
let url = URL.createObjectURL(workerBlob);
let worker = new Worker(url);
Alternatively (less efficient): Encode the JS string as base64 and use a data: URL.
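Writing worker code as a string literal is awkward to maintain. One common variation of the Blob approach, sketched here under assumptions, serializes an ordinary function instead. The helper names `buildWorkerSource` and `createWorkerUrl` and the `echoWorkerMain` body are hypothetical, and the worker is only started when a global Worker constructor exists.

```javascript
// Turn a function into standalone worker source: the function is
// serialized with toString() and invoked immediately inside the worker.
// The function must not rely on variables from its enclosing scope,
// because only its text reaches the worker.
function buildWorkerSource(workerMain) {
  return `(${workerMain.toString()})();`;
}

// Create an object URL for that source, as in the Blob example above.
function createWorkerUrl(workerMain) {
  const blob = new Blob([buildWorkerSource(workerMain)], { type: 'text/javascript' });
  return URL.createObjectURL(blob);
}

// Hypothetical worker body used only for illustration.
function echoWorkerMain() {
  self.onmessage = (evt) => self.postMessage(evt.data);
}

// Only start a real Worker where the constructor exists (a browser).
if (typeof Worker !== 'undefined') {
  const url = createWorkerUrl(echoWorkerMain);
  const worker = new Worker(url);
  worker.onmessage = (evt) => {
    console.log('echo:', evt.data);
    worker.terminate();
    URL.revokeObjectURL(url); // release the Blob once the worker is done
  };
  worker.postMessage('hello world!');
}
```

Revoking the object URL after the worker has finished avoids keeping the Blob alive for the lifetime of the page.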
Engineering Tooling
Modern frontend projects use tools like Webpack or Vite. Refer to webpack worker-loader and vite web-workers documentation for integration methods.
API Access Restrictions
Workers have a global context (self) similar to the main thread's, but they cannot access the DOM or use methods such as alert() and confirm(). They can use console and debugger for debugging. Refer to the documentation for the detailed differences in global scope.