How we used NLP to optimize Argus Data Insights
We used natural language processing to help Argus develop an intuitive engine for detailed reports and rich insights.
Argus Data Insights helps companies access the media information they need to make better business decisions. The company uses a mix of technology and human input to monitor print, social media, TV and radio – giving detailed, actionable insights into the industry.
The challenge: finding the most relevant insights for clients
With millions of words published online every day, how do you filter down to the most relevant information for your business? Not just the stories you're mentioned in, but the industry insights that will open up new opportunities for success. That's the challenge Argus helps its clients solve – with rich, tailored insights.
The company wanted to improve the accuracy, relevance and speed of delivery for its insights product. To do that, it sought a software partner to apply the latest data science models and create an advanced solution for the future.
Results: An NLP-fuelled engine for detailed reports and rich insights
We used multiple natural language processing (NLP) models to dramatically improve the quality of the media insights Argus could offer its clients. We focused on four areas to improve:
- Sentiment analysis – how to detect the underlying tone of text (negative, positive, neutral or mixed)
- Named entity recognition – identifying people, organizations and locations from text, a crucial element of any report
- Target dominance – working out the focus of the content, to judge whether it’s a passing reference or a detailed analysis of the target business, person, or topic
- Language detection – identifying the language of the input text
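Of these, target dominance is the least standard task. A toy heuristic gives the flavour of it (this is only an illustrative sketch with a hypothetical function, not Argus' actual model, which builds on the NER output):

```python
def target_dominance(text: str, target: str) -> float:
    """Rough dominance score in [0, 1]: the share of sentences that
    mention the target, weighted towards the opening of the text.
    A production system would use NER output, not string matching."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return 0.0
    n = len(sentences)
    # Earlier sentences get higher weights (headline and lede matter most).
    weights = [(n - i) / n for i in range(n)]
    hits = [w for s, w in zip(sentences, weights) if target.lower() in s.lower()]
    return sum(hits) / sum(weights)
```

A passing mention buried at the end of a long article scores low; a company named in the headline and throughout the body scores high – which is exactly the signal needed to decide whether a clip is worth a client's attention.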
Using language-agnostic NLP
At every stage of the project we worked closely with the team at Argus to iterate towards the best result. And although we were using models with complex architectures – like BERT and RoBERTa – we made sure they were language-agnostic.
Microservices to improve speed
Alongside detail and accuracy, we improved the speed of the service too. We built every NLP model as an independent microservice with its own API – which makes them much faster to use. It also meant that each of Argus’ products could use the models independently, depending on their needs.
Working in German for multilingual solutions
Argus works primarily in German, so for this project we did too. It meant diving deep into the technical language of the space, and helped us work closely with Argus to deliver the best result. The solution we delivered is German-first, with several services supporting English as well. And we created a service that detects the language of written text, to help improve the insights Argus produces.
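As a toy stand-in for such a language detector, a stopword-frequency heuristic shows the basic idea (the production service uses trained models; the word lists and function below are illustrative):

```python
# Score text against small per-language stopword sets.
STOPWORDS = {
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "mit"},
    "en": {"the", "and", "is", "not", "a", "with", "of", "to"},
}

def detect_language(text: str) -> str:
    """Return the language whose stopwords appear most often,
    or "unknown" when nothing matches."""
    words = text.lower().split()
    scores = {
        lang: sum(w in stops for w in words)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

Routing each input through a detector like this first means the German-tuned and English-tuned models downstream only ever see text they were trained for.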
Flexible compatibility and increased speed
We provided an API for each NLP model, so it could be used independently by any of the client's products. In combination with Elasticsearch, this solution significantly reduces the time needed to generate final results.
Data Science Services
We offer leading end-to-end data solutions that will help you make the best business decisions, improve user experience, and turn your big vision into reality. Your dreams. Our expertise. Together, we give you the strength to succeed.
The tech we used for this project
We used a full suite of data science models to build a powerful solution for Argus. Here are the details:
- Standard Python for cleaning and tidying data – we used libraries like NumPy, Pandas, PyTorch, and NLTK to remove errors and combine complex data sets.
- NLP models from Hugging Face – an open-source platform provider of NLP technologies.
- JupyterHub for fine-tuning – an on-premises solution for fine-tuning models on the client's data.
- Different models for different tasks – depending on the task at hand, we used models with complex architectures like BERT, RoBERTa, and more.
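The cleaning step can be pictured with a small pandas sketch (the sample data and column names are invented for illustration, not taken from Argus' pipeline):

```python
import pandas as pd

# Hypothetical scraped-media sample with typical defects:
# duplicate crawls, missing article text, inconsistent language codes.
raw = pd.DataFrame({
    "article_id": [1, 1, 2, 3],
    "text": ["Acme expands", "Acme expands", None, "Markt wächst"],
    "lang": ["EN", "EN", "en", "de"],
})

clean = (
    raw.drop_duplicates(subset="article_id")            # drop repeated crawls
       .dropna(subset=["text"])                         # drop empty articles
       .assign(lang=lambda df: df["lang"].str.lower())  # normalize codes
       .reset_index(drop=True)
)
```

Only after passes like these does the text reach the NLP models – noisy duplicates and empty records would otherwise skew both fine-tuning and the final insight scores.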
While machine learning was the core part of this project, we also used various other technologies to plug into the tools that data scientists use:
- FastAPI library – to create the backend services
- Elasticsearch – used as a database
- React – frontend technology used for the demo application
- DevOps technologies – Microsoft Azure Cloud, Docker, Kubernetes
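A minimal container recipe for one such service might look like this (the base image, module path, and port are hypothetical; a real Kubernetes deployment on Azure would add health checks, resource limits, and secrets handling):

```dockerfile
# Hypothetical image for a single NLP microservice.
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
# Serve the FastAPI app; the module path "sentiment_service:app" is illustrative.
CMD ["uvicorn", "sentiment_service:app", "--host", "0.0.0.0", "--port", "8000"]
```

Packaging each model service as its own image is what lets Kubernetes scale a heavily used model (say, sentiment) independently of a lighter one (say, language detection).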