I was wondering how different workplaces track user actions/events from a backend perspective. Popular options include Segment and mParticle, but they seem quite expensive for small/medium-sized enterprises.
A Google Cloud Function does streaming inserts into BigQuery. Analysis is then carried out with the BigQuery UI (ad hoc analyses or scheduled queries for pipelines), Data Studio (visualizations -- we use this infrequently; it's really not great), or Google Colab (statistical analyses, complex visualizations).
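A minimal sketch of such a function (the table name and event fields below are placeholders, not our actual schema):

    # Minimal HTTP-triggered Cloud Function that streams one event into BigQuery.
    # Table name and event fields are illustrative assumptions.
    import json
    from datetime import datetime, timezone
    from google.cloud import bigquery

    client = bigquery.Client()
    TABLE = "my-project.analytics.events"  # hypothetical dataset/table

    def track_event(request):
        payload = request.get_json(silent=True) or {}
        row = {
            "event_name": payload.get("event_name"),
            "user_id": payload.get("user_id"),
            "properties": json.dumps(payload.get("properties", {})),
            "received_at": datetime.now(timezone.utc).isoformat(),
        }
        errors = client.insert_rows_json(TABLE, [row])  # streaming insert
        if errors:
            return ("insert failed", 500)
        return ("ok", 204)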
The pros are:
* full SQL access to the data via BigQuery
* simple to set up (yes, you have to write custom code, but a basic implementation is on the order of a couple dozen lines of code)
* we have full ownership of the data across the pipeline (better for user privacy than using another 3rd party)
Prior to this we used Google Analytics, but their paid solution is too expensive for us, and their analytics/aggregation API (though quite powerful) samples a subset of the data, which wasn't acceptable for some of our use cases.
That's what we do at Rakam. If you have fewer than ~10M events per month, PostgreSQL works smoothly if you use features such as partitioned tables, parallel queries, and BRIN indexes. The only limitation is that since it's not horizontally scalable, your data must fit on one server. We have SDKs that provide ways to send the event data with the following format:
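Roughly, a payload looks like this (the field names here are illustrative, inferred from the description below, not the documented SDK format):

    # Illustrative event payload: one event type plus an arbitrary bag of attributes.
    # Field names are assumptions, not Rakam's documented format.
    event = {
        "collection": "pageview",          # event type -> table name
        "properties": {                    # each attribute -> a column, type inferred
            "url": "/pricing",
            "referrer": "https://news.ycombinator.com/",
            "user_id": "u_1234",
        },
    }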
The event types and attributes are all dynamic. The API server infers the attribute types from the JSON blob, creates a table for each event type with a column for each attribute, and inserts the data into that table. It also enriches the events with visitor information such as user agent, location, and referrer. Users then just run queries like the following:
SELECT url, count(*) FROM pageview WHERE _city = 'New York' GROUP BY 1
We're using Snowplow and it works great. The setup process is complicated and you have to declare event schemas before you start tracking, but it's quite cost-effective if that's your priority.
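To give a sense of what declaring a schema involves, here is roughly the shape of a self-describing event schema (shown as a Python dict for brevity; in practice it's a JSON file registered in your schema repository, and the vendor/event name here are made up):

    # Rough shape of a Snowplow self-describing event schema.
    # Vendor and event name are made up for illustration.
    button_click_schema = {
        "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
        "description": "Fired when a user clicks a tracked button",
        "self": {
            "vendor": "com.example",
            "name": "button_click",
            "format": "jsonschema",
            "version": "1-0-0",
        },
        "type": "object",
        "properties": {
            "button_id": {"type": "string"},
            "page": {"type": "string"},
        },
        "required": ["button_id"],
        "additionalProperties": False,
    }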
We have millions of rows coming in from unique users in real time for about $600/mo; with Segment this would be at least $5,000/mo.
We then use Redash to prepare charts, tables, etc. for analysis.
The PaperTrail gem in Ruby on Rails. If that doesn't exist for your platform, just roll your own little table that logs who did what, and when. This is so easy to do; why pay thousands per year for some third party to do it while siphoning off possibly sensitive data as a side effect?
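A minimal sketch of the roll-your-own version (SQLite here purely for illustration; the same two statements work on any SQL database):

    # Minimal "who did what, when" event log.
    import sqlite3
    from datetime import datetime, timezone

    conn = sqlite3.connect("events.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS events (
            id         INTEGER PRIMARY KEY,
            user_id    TEXT NOT NULL,
            action     TEXT NOT NULL,
            item_type  TEXT,
            item_id    TEXT,
            created_at TEXT NOT NULL
        )
    """)

    def log_event(user_id, action, item_type=None, item_id=None):
        conn.execute(
            "INSERT INTO events (user_id, action, item_type, item_id, created_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, action, item_type, item_id,
             datetime.now(timezone.utc).isoformat()),
        )
        conn.commit()

    log_event("42", "updated", "invoice", "INV-1001")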
Firebase and Cloud Functions could be a start. I use this to track events and take actions accordingly via Cloud Functions; in my case they aren't user events, but the same approach would work in a user-event scenario.
Is there a good solution for event/usage tracking for intranet apps without access to the Internet? Or is the only option to roll everything yourself?