Hosting a Collection Server
With the goal of being as transparent as we can, we make it possible to host your own instance of a Pushbroom data collection server. The source code for the server is available on GitHub under an MIT open source license.
The server is a SvelteKit application that only runs route endpoint functions. This allows the application to be built as either a collection of serverless functions or a traditional Node.js app, whichever is appropriate for your deployment needs. Included in the repository is a Dockerfile for building a container image of the server, as well as a fly.toml configuration file for deploying that container to Fly.io. Pushbroom can also be deployed as-is to Vercel.
Setting up a self-hosted instance of Pushbroom requires a few small configuration steps.
Setting the Environment Variables for Your Triplestore.
To run your own collection server, you need to host your own triplestore instance to store the data you’re collecting. Once you’ve done that, you can set the following .env variables to reference your store and credentials:
sparql_endpoint=https://stucco-proxy.fly.dev
sparql_user=username
sparql_password=password
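Because a missing or mistyped variable can be hard to diagnose after deployment, it may help to check the three values before starting the server. The following is a minimal sketch, not code from the Pushbroom repository: the `TriplestoreConfig` type and the `readTriplestoreConfig` function are hypothetical names, and the only detail taken from this page is the three variable names.

```typescript
// Hypothetical helper for illustration; not part of the Pushbroom source.
interface TriplestoreConfig {
  endpoint: string;
  user: string;
  password: string;
}

// Reads the three variables documented above from a plain key/value map
// and fails early with one message that lists everything that is missing.
function readTriplestoreConfig(
  env: Record<string, string | undefined>
): TriplestoreConfig {
  const required = ["sparql_endpoint", "sparql_user", "sparql_password"];

  // Treat unset and whitespace-only values the same way.
  const missing = required.filter((name) => (env[name] ?? "").trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const endpoint = (env["sparql_endpoint"] ?? "").trim();
  // Assumption: the endpoint is an absolute HTTP(S) URL, as in the example value.
  if (!/^https?:\/\//.test(endpoint)) {
    throw new Error("sparql_endpoint must start with http:// or https://");
  }

  return {
    endpoint,
    user: env["sparql_user"] ?? "",
    password: env["sparql_password"] ?? "",
  };
}
```

In a Node.js process, this could be called as `readTriplestoreConfig(process.env)` once the .env file has been loaded by whatever mechanism your deployment uses.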
Setting the Collection Endpoint URL.
The file at /static/ping.js contains the client-side script that runs in the browser and sends data to Pushbroom. On the very last line of this file, you can see where the function is initialized with the URL of the collection server:
…(window, document, 'ping.pushbroom.co')
Change ping.pushbroom.co to the URL where you will be hosting the server.
Now when the script is loaded, it will start sending analytics data to your deployment of Pushbroom, which will persist the data in your own hosted triplestore.
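If you would rather script this change than edit the file by hand (for example, as a build step), the substitution can be sketched as a plain string rewrite. This is an illustrative sketch only: `setCollectionHost` is a hypothetical name, `ping.example.com` is a placeholder, and the code assumes the default value appears in the file as a quoted string literal exactly as shown above.

```typescript
// Hypothetical helper for illustration; not part of the Pushbroom source.
// Replaces the default collection host inside the text of ping.js.
function setCollectionHost(source: string, newHost: string): string {
  // Match the default host only when it is a complete quoted string literal,
  // keeping whichever quote character the file uses.
  const pattern = /(['"])ping\.pushbroom\.co\1/g;

  let replaced = 0;
  const updated = source.replace(pattern, (_match: string, quote: string) => {
    replaced += 1;
    return `${quote}${newHost}${quote}`;
  });

  // Fail loudly instead of silently shipping a script that still
  // points at the default server.
  if (replaced === 0) {
    throw new Error("Default host 'ping.pushbroom.co' not found in source");
  }
  return updated;
}
```

Given the contents of /static/ping.js as a string, `setCollectionHost(contents, "ping.example.com")` returns the same script with only the host literal changed; reading and writing the file is left to your build tooling.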
Running the Pushbroom Analytics Application Against Your Data.
Since all your data is stored in a standards-implementing triplestore, it’s all freely accessible via SPARQL as RDF. You can run your own analysis tools against your data, pipe it into different applications, or just download it all for static analysis.
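As a starting point for running your own queries, the sketch below builds a query URL in the form defined by the SPARQL 1.1 Protocol, where the query text is sent as a URL-encoded `query` parameter on a GET request. The function name `buildSparqlQueryUrl` and the endpoint `https://triplestore.example.com/sparql` are placeholders, and the sample query is deliberately generic because this page does not describe the vocabulary Pushbroom uses for its data.

```typescript
// Hypothetical helper for illustration; not part of the Pushbroom source.
// Builds a SPARQL 1.1 Protocol "query via GET" URL for a given endpoint.
function buildSparqlQueryUrl(endpoint: string, query: string): string {
  // Append with "&" if the endpoint already carries a query string.
  const separator = endpoint.indexOf("?") === -1 ? "?" : "&";
  return `${endpoint}${separator}query=${encodeURIComponent(query)}`;
}

// A generic query that lists ten arbitrary triples from the store.
const sampleTriplesQuery = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10";

// Placeholder endpoint; substitute the address of your own triplestore.
const sampleTriplesUrl = buildSparqlQueryUrl(
  "https://triplestore.example.com/sparql",
  sampleTriplesQuery
);
```

The resulting URL can be requested with any HTTP client. Sending an `Accept: application/sparql-results+json` header asks for JSON results, and how credentials are supplied depends on how your triplestore is configured.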
In order to use the Pushbroom Analytics application to understand your data, you’ll have to use a white-labeled deployment that communicates with your – and only your – hosted servers. If you’re interested in this solution, please reach out to us for more information!