The problem appeared when the client requested more records than the server could hold in memory. Depending on the server heap size, the server ran out of memory at about 75,000 records and became unresponsive. Given our large data sets and the possibility of hundreds of connections, the memory issue had to be fixed. I had to implement data streaming so that the server could run in constant memory. I chose WebSockets because I wanted a cooperative streaming protocol built on bidirectional communication. This was the start of my journey towards Server Empathy.
The original implementation failed because of a memory overload, but for posterity it is outlined in these three steps:
1) Client sends an HTTP request to the server asking for all trades in the database that match certain criteria
2) Server opens a database connection and reads all relevant rows into memory
3) Server does some work and returns the data to the client in the response
In order to implement a solution I modified the way the database returns records, and created a communication protocol over WebSockets for client/server communication.
First, let’s discuss the database connection. In Java JDBC, executing a query returns an object of class ResultSet. The ResultSet is a cursor into the query results prepared by the database server. The original implementation fetched every result row, held them all in memory, and then sent them to the client.
In the new implementation the server uses the ResultSet cursor to manage the flow of data from the database to the server. In each iteration, the server fetches only a small batch of records from the ResultSet. It then enriches the data and drops it on the WebSocket for the client before proceeding to the next batch. Once a batch has been sent, nothing references it any more, so it becomes eligible for garbage collection, which allows the server to run in constant memory. Next, I outline how I built a client-server communication protocol to minimize the memory burden on the server.
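Before moving on to the protocol, the batching loop described above can be sketched roughly as follows. This is a minimal illustration in Java, not the application's actual code (which is written in Clojure): a plain `Iterator` stands in for the database cursor, and the names `BatchStreamer`, `streamInBatches`, and `sink` are hypothetical. With a real JDBC connection, a fetch-size hint such as `Statement.setFetchSize` may play a similar role on the database side, though how a driver honors that hint varies.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.IntStream;

// Hypothetical sketch: an Iterator stands in for the database cursor.
class BatchStreamer {

    /**
     * Drains the cursor in batches of at most batchSize rows, handing each
     * batch to the sink before the next batch is fetched.
     * Returns the total number of rows handed to the sink.
     */
    static <T> int streamInBatches(Iterator<T> cursor, int batchSize,
                                   Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int sent = 0;
        while (cursor.hasNext()) {
            // Only this one batch is held by the loop at any time.
            List<T> batch = new ArrayList<>(batchSize);
            while (batch.size() < batchSize && cursor.hasNext()) {
                batch.add(cursor.next());
            }
            // In the real server this step would enrich the rows and write
            // them to the WebSocket (assumption: the sink does not keep them).
            sink.accept(batch);
            sent += batch.size();
            // The batch is unreferenced after this iteration and can be
            // garbage collected.
        }
        return sent;
    }

    public static void main(String[] args) {
        Iterator<Integer> rows = IntStream.range(0, 10).iterator();
        int total = streamInBatches(rows, 4,
                batch -> System.out.println("sending " + batch));
        System.out.println("rows sent: " + total);
    }
}
```

The memory bound only holds if the sink lets go of each batch once it has been sent; a sink that accumulates batches would bring back the original problem.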
When the client gets some data, it renders the data into the FDT and then sends a message to the server over the WebSocket. The message tells the server whether it should stop or continue sending records. The client has the option to request the termination of the connection at any time for any reason. The server will terminate the connection either when it has received a stop signal from the client, or when the query has completed.
The interaction is detailed in the diagram below (bidirectional arrows represent a WebSocket):
In this model the server only needs to keep at most a predetermined number n of records in memory at any given time. This reduces the load on the server and allows the client to terminate the connection if the application becomes slow or the user navigates away from the page. Because client and server exchange a message after every batch, each side learns quickly when the other wants to stop or has run into trouble, and can react right away.
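The exchange described above can be sketched as a small loop in which the server sends one batch, waits for the client's reply, and then either continues or stops. This is an illustrative Java sketch under stated assumptions, not the application's Clojure code: the WebSocket is replaced by a plain function call, and the names `CooperativeStream`, `Reply`, `Outcome`, and `serve` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.IntStream;

// Hypothetical sketch of the continue/stop handshake; no real WebSocket is used.
class CooperativeStream {

    /** What the client tells the server after rendering a batch. */
    enum Reply { CONTINUE, STOP }

    /** Why the server ended the stream. */
    enum Outcome { QUERY_COMPLETED, CLIENT_STOPPED }

    /**
     * Sends batches of at most batchSize rows to the client and waits for a
     * reply after each one. Ends when the client replies STOP or when the
     * cursor is exhausted.
     */
    static <T> Outcome serve(Iterator<T> cursor, int batchSize,
                             Function<List<T>, Reply> client) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        while (cursor.hasNext()) {
            List<T> batch = new ArrayList<>(batchSize);
            while (batch.size() < batchSize && cursor.hasNext()) {
                batch.add(cursor.next());
            }
            // Stands in for "send over the WebSocket, then read the reply".
            Reply reply = client.apply(batch);
            if (reply == Reply.STOP) {
                return Outcome.CLIENT_STOPPED;
            }
        }
        return Outcome.QUERY_COMPLETED;
    }

    public static void main(String[] args) {
        int[] rendered = {0};
        // A client that renders each batch and asks to stop after 30 rows.
        Outcome outcome = serve(
                IntStream.range(0, 100).boxed().iterator(), 10,
                batch -> {
                    rendered[0] += batch.size();
                    return rendered[0] >= 30 ? Reply.STOP : Reply.CONTINUE;
                });
        // prints "CLIENT_STOPPED after 30 rows"
        System.out.println(outcome + " after " + rendered[0] + " rows");
    }
}
```

Because the server never fetches the next batch until the client has replied, the client effectively sets the pace, and the remaining 70 rows in this example are never read from the cursor.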
WebSockets may be more difficult to implement than a standard HTTP request, but they provide a lot of flexibility in the client/server interaction. They are particularly effective for communicating large amounts of data, because they allow batch streaming, which lightens the memory load on both client and server. So if you “care” about server memory, speed, or network bandwidth, there is a strong possibility WebSockets may be useful!
If you are interested, here is some information on the technology stack used for this application:
core.async – The asynchronous library we use on both client and server. It processes messages entering and leaving the WebSocket.
re-frame – A small ClojureScript framework built on top of Reagent, which wraps ReactJS. It helps to manage state complexity while providing a way to program functionally.
Clojure/ClojureScript – The programming languages we use on the server and the client, respectively.
Chord – WebSocket library, built on top of http-kit.