Xebia is a driven IT service company operating on agile principles. It specializes in Big Data, Web, Cloud and Java architectures, and in the transition to agile environments. Quality without compromise and customer intimacy are its long-held focus, and Xebians are at the heart of it all. People come first, and it shows from project assignments to knowledge sharing, with an emphasis on the monthly Xebia Knowledge Exchange days, which ensure knowledge transfer among Xebians.
The Challenge: A turnkey centralisation solution
One of their clients, an online bank, had its showcase websites built on legacy Java code running on an old infrastructure. Updates were slow, following three-month release cycles. Xebia redesigned the whole system in Node.js with Auto Scaling groups (from 2 to 10 servers) to manage load, together with many Lambda, CloudFront, Beanstalk and Nginx building blocks. The whole structure is hosted on AWS.
With so many different parts of the system to account for, a log centralization tool was necessary. Xebia, which was in charge of the project, did not want to build everything from scratch, as that often leads to maintainability and obsolescence issues. Their client, for its part, did not want to spend time and resources on something outside its core know-how.
1. The Solution: Cross-layer tracking of a multi-component infrastructure
Having heard of Logmatic.io from a colleague, and being interested in the maintenance-free side of the tool, the Xebians in charge of the project decided to give it a look. After creating an account, they streamed all their logs into it and were soon building dashboards.
“There’s no need for a sprint in order to build an Elasticsearch cluster and collect logs.
In 15 min we had our first dashboards.” Jérémy Pinsolle
The Xebians went on to show their client dashboards mixing technical and business KPIs, and the client became very interested. Logmatic.io dashboards were then put up on a TV, where tech teams could directly spot error spikes.
Correlations were drawn between the app layers and the Nginx servers. From human feedback, Apdex scores and New Relic data, the team already knew that some requests were being lost, but they could not previously identify what was happening. Now they were able to check each layer, one after the other: a request would be found in the CDN, then in Nginx, but not in the app, for example. Logmatic.io was crucial in identifying the layer in which requests were being lost.
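The layer-by-layer check described above can be illustrated with a minimal Python sketch. It is not Xebia's actual setup: it assumes that every layer logs a shared request identifier, and the field names `request_id` and `layer`, the layer order in `LAYERS`, and both function names are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Assumed order in which a request crosses the stack (hypothetical names).
LAYERS = ["cdn", "nginx", "app"]


def last_layer_seen(events: Iterable[dict], request_id: str) -> Optional[str]:
    """Return the deepest layer a request reached without a gap, or None."""
    seen = {e["layer"] for e in events if e.get("request_id") == request_id}
    last = None
    for layer in LAYERS:
        if layer not in seen:
            break  # the request never showed up in this layer
        last = layer
    return last


def lost_requests(events: Iterable[dict]) -> Dict[str, Optional[str]]:
    """Map each request that never reached the final layer to its last known layer."""
    events = list(events)
    request_ids = {e["request_id"] for e in events if "request_id" in e}
    lost = {}
    for rid in request_ids:
        last = last_layer_seen(events, rid)
        if last != LAYERS[-1]:
            lost[rid] = last
    return lost


# Made-up sample events: r1 crosses every layer, r2 stops after Nginx.
sample_events = [
    {"request_id": "r1", "layer": "cdn"},
    {"request_id": "r1", "layer": "nginx"},
    {"request_id": "r1", "layer": "app"},
    {"request_id": "r2", "layer": "cdn"},
    {"request_id": "r2", "layer": "nginx"},
]
```

Calling `lost_requests(sample_events)` flags `r2` as last seen in Nginx, which is the kind of answer the team was looking for when a request was found in the CDN and Nginx but not in the app.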
2. Beyond the Solution: Improved conversations across business units
Logmatic.io allowed the online banking team to properly monitor the API used for dev environments, with graphs for each endpoint. It turned what had mainly been a gut feeling into undeniable evidence: this API, which was managed by their parent company, was performing poorly and putting a daily strain on developers' work.
“With proper data presented to back up discussions in their internal management meetings, they were able to finally have the parent company set up more powerful servers for dev environments.” Jérémy Pinsolle
To further improve performance of the API in production, ElastiCache / Redis was implemented. Xebians then sent the proper metrics to Logmatic.io to monitor its behaviour with cache retention rate, proportion of stale data, errors and cache performance.
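As a rough illustration of the kind of cache metrics mentioned above, the following Python sketch derives a few ratios from raw counters and serializes them as a JSON log line. The counter names, the metric names and the reading of "retention rate" as a hit rate are all assumptions made for the example, not details taken from the project.

```python
import json


def cache_metrics(hits: int, misses: int, stale: int, errors: int) -> dict:
    """Derive simple cache indicators from raw counters (hypothetical names)."""
    lookups = hits + misses
    return {
        # Share of lookups served from the cache (assumed meaning of "retention rate").
        "cache_hit_rate": hits / lookups if lookups else 0.0,
        # Share of cache hits that returned stale data.
        "cache_stale_ratio": stale / hits if hits else 0.0,
        "cache_errors": errors,
    }


def to_log_line(metrics: dict) -> str:
    """Serialize the metrics as one JSON line, ready to ship to a log collector."""
    return json.dumps(metrics, sort_keys=True)
```

Emitting one such line per reporting interval keeps the metrics in the same pipeline as the rest of the logs, so they can be graphed alongside errors and API response times.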
Are you facing the same challenges? We would love to talk. Drop us a line @logmatic.