Case Studies


The number of loans turning into NPAs (non-performing assets) is increasing year on year, which increases credit defaults and affects the profitability of the organization. A customer avails a loan facility from a loan provider, i.e. a banking or financial organization, and when repayment of interest and principal is overdue, the asset is classified as an NPA. The banking and financial sectors face serious challenges as the percentage of defaulters and NPAs rises year after year.

Solution Approach:

To reduce NPAs and defaulters, SDT built a predictive model that can predict, well in advance, the profiles that may become defaulters or NPAs.

How has SDT achieved the solution?

The following datasets were used to build a predictive model to find defaulters and NPAs.

  1. Loan Profile
  2. Recovery Data
  3. Actions Data
    • The loan profile contains borrower and co-borrower information with attributes such as industry, designation, income, age group and location.
    • The recovery data contains details of the current outstanding loan amount, the number of outstanding months, and the accounts falling into the defaulter and NPA categories.
    • The actions data covers events such as visits, e-mails, tele-calls and SMS, i.e. the actions taken on a loan profile once it has become NPA or defaulter.
    • Based on the above data sets, a model was built that categorizes customers as NPA, Non-NPA, or having the propensity to become NPA.
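
The three-way categorization described above can be sketched as follows. This is a minimal illustration only, not SDT's model: the `LoanProfile` fields, the `default_score` (assumed to be a model output between 0 and 1) and the `propensity_threshold` value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoanProfile:
    loan_id: str
    is_npa: bool           # assumed flag taken from the recovery data
    default_score: float   # hypothetical model score in [0, 1]


def categorize(profile: LoanProfile, propensity_threshold: float = 0.6) -> str:
    """Place a loan profile into one of the three categories."""
    if profile.is_npa:
        return "NPA"
    if profile.default_score >= propensity_threshold:
        return "Propensity to NPA"
    return "Non-NPA"


profiles = [
    LoanProfile("L001", True, 0.95),
    LoanProfile("L002", False, 0.72),
    LoanProfile("L003", False, 0.10),
]
categories = {p.loan_id: categorize(p) for p in profiles}
```

Profiles in the "Propensity to NPA" category are the ones an organization would follow up on before they become overdue.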


Analytics Outcome:

Loan profiles are categorized as NPA or Non-NPA based on the scores generated for them. This allowed the organization to track the predicted NPAs well in advance, which reduced the NPA percentage significantly in the first quarter after implementing the solution.
From the predicted NPAs and defaulters, the following patterns and reports were generated.

  • Which location has the highest number of defaulters?
  • What are the designations of the defaulters?
  • Which income range has the most defaulters?
  • Which industry has the most defaulters?


Reduce Churn Rate through Free Text Analytics

Problem Statement:
There is a high churn rate of 23% in the existing customer base of the service provider.

Solution Approach:

  • There are 1,400 tables available in the database of an Australian electricity provider.
  • However, the reason for the churn could not be traced in the available data.


How did SDT achieve the solution?

  • In order to find the reason for the churn and decrease the churn rate, free text analytics was used to analyze the free text entered by the customer care executives.
  • The free text comments were categorized into two buckets, i.e. a Service bucket and a Price bucket.
  • After analysis, 45% of the data fell into both the Service and Price buckets; this insight helps in reducing the churn rate.


A Customer Activity sheet is maintained, and all customer activities are recorded in it.

After removing the stop words and the first and last lines used by the customer from the Customer Activity sheet, the actual issue is identified, i.e. what first made the customer turn negative, and the data is grouped accordingly.


  • The major cause of churn is hidden charges that the customer is not aware of. For example, there are hidden charges when the customer moves to a new house. Bills are also generated quarterly, so even if the customer is away for a month or more, he or she is liable to pay the billed amount.
  • Using this approach, a Sentiment Change Point is identified in the free text comments. The above issue can be resolved by sending bills monthly and helping the customer understand the details of the charges.
  • The sentiment change point identifies where the customer became dissatisfied, which helps in decreasing the churn rate.
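
One way to picture a sentiment change point is the first comment in a customer's activity history whose sentiment turns negative. The sketch below assumes a tiny hand-made word list (`POSITIVE`, `NEGATIVE`) and a simple word-count score; these are illustrative stand-ins, not the lexicon or scoring used in the actual project.

```python
import re

# Hypothetical word lists, for illustration only.
POSITIVE = {"happy", "thanks", "resolved", "good"}
NEGATIVE = {"unhappy", "hidden", "overcharged", "complaint", "cancel"}


def comment_score(text):
    """Positive words minus negative words in one comment."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_change_point(comments):
    """Index of the first comment that turns negative, or None."""
    previous = 0
    for index, comment in enumerate(comments):
        score = comment_score(comment)
        if score < 0 and previous >= 0:
            return index
        previous = score
    return None


history = [
    "Customer happy with the new connection",
    "Asked about moving to a new house",
    "Unhappy about hidden charges on the quarterly bill",
    "Wants to cancel the service",
]
change_index = sentiment_change_point(history)
```

Here `change_index` points at the third comment, the one that first mentions hidden charges.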

Analytics Outcome:

Reduced the churn rate by 8% within the first 6 months after implementing the solution.

Our Approach

The health care company specializes in providing infusion and nourishment services.
Such a company treats approximately 700K patients. It provides services to patients, generates the invoices and sends them to the payors.
The payors validate each invoice. If they find no problem with it, they pay the amount to the provider; otherwise the invoice is denied and goes to the relevant collection process.
The major issue with this process was that the number of cases falling into the collection phase had increased, which was decreasing revenue, a serious problem for the company.
The company needed to understand its collection efficiency:
How many invoices are going to the collection process?
Within how many days of generating an invoice does the company get the maximum amount of collection?
Drill down into the reasons and bucket the insurance companies into those that pay properly and those that do not.
Which insurance companies do not pay for which kinds of products and for which specific diseases?
Perform cross analytics on the attributes.
The data from Excel, Salesforce and SQL Server was ingested into Hadoop using open source components such as Sqoop and Flume.
Out of 11 million records, 700K records were identified as the actual data on analysis.
These 700K records were then bucketed into 3 different categories, such as transaction data and activity data.
The transaction data holds information about the transactions, such as the customer name, the amount the customer has paid and the amount still to be paid.
The activity data contains historic versioning information, such as when exactly the invoice was generated, when it was first sent to the payor, whether it was denied or partially denied, and the payment made by the payor.

  • For the activity data set, a dashboard showing cash collection efficiency was created.
  • Calculated the collection cycle efficiency by identifying in how many cycles the amount is collected.
  • Identified the outliers, i.e. the months in which collection varied to a huge extent, and the reasons for it.
  • Identified the payors from which the maximum number of invoices go into the collection process.
  • Identified the products for which the organization does not get paid.
  • Identified that some insurance companies do not pay 100% for repeated requests for a specific product within a timeframe.
  • Identified that in most cases improper submission of documents leads to denials from payors.
  • In this quarter the company was able to improve its collection efficiency by 8%.
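
Two of the measures above can be sketched in a few lines. The definitions used here are assumptions made for illustration: collection efficiency is taken as collected amount divided by invoiced amount, and the per-payor figure is the share of that payor's invoices that went to collection. The record fields (`payor`, `invoiced`, `collected`, `in_collection`) are hypothetical.

```python
from collections import defaultdict


def collection_efficiency(invoices):
    """Collected amount as a fraction of the invoiced amount."""
    total_invoiced = sum(i["invoiced"] for i in invoices)
    total_collected = sum(i["collected"] for i in invoices)
    return total_collected / total_invoiced if total_invoiced else 0.0


def collection_share_by_payor(invoices):
    """Fraction of each payor's invoices that went to collection."""
    counts = defaultdict(lambda: [0, 0])  # payor -> [sent to collection, total]
    for inv in invoices:
        counts[inv["payor"]][1] += 1
        if inv["in_collection"]:
            counts[inv["payor"]][0] += 1
    return {payor: sent / total for payor, (sent, total) in counts.items()}


invoices = [
    {"payor": "A", "invoiced": 1000.0, "collected": 1000.0, "in_collection": False},
    {"payor": "A", "invoiced": 500.0, "collected": 250.0, "in_collection": True},
    {"payor": "B", "invoiced": 500.0, "collected": 0.0, "in_collection": True},
]
efficiency = collection_efficiency(invoices)
share = collection_share_by_payor(invoices)
```

Sorting `share` in descending order gives the payors whose invoices most often end up in the collection process.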


Perform Analytics on the unstructured data by transforming it into structured data.

Our Approach:

Most organizations keep tons of unstructured data and logs aside, where they go completely to waste even though a great deal of information can be obtained from them. Organizations set these logs aside because they are complex and extracting information from them requires much effort.

To obtain quality information, analytics can be performed on a merged data set of structured and unstructured data, but the two data sets are far from similar.

How has SDT achieved it?

Our solution is to ingest the unstructured data set into Hadoop, applying a few transformations to convert the unstructured data into structured data.

The transformed data set is then merged with the existing structured data set to form the complete data set, on which analytics can be performed to obtain the required information.


  • Structured data is not simply parsed; it is extracted from the unstructured sets by applying complex filters that capture the accurate information.
  • As this is complex data, regular expressions need to be applied in creative ways to capture the gist of the content from the plain data.
  • To capture the data accurately from the unstructured set, custom UDFs are also applied on top of it to get quality information.
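
The regular-expression step above can be sketched as follows. The log layout (timestamp, level, `user=` field, free message) and the field names are assumptions chosen for illustration; real logs would need their own patterns, and in a Hadoop pipeline this logic would typically live inside a custom UDF.

```python
import re

# Hypothetical log layout: "<date> <time> <LEVEL> user=<id> <message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"user=(?P<user>\S+)\s+"
    r"(?P<message>.*)"
)


def parse_line(line):
    """Turn one raw log line into a structured record, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    return match.groupdict() if match else None


raw_lines = [
    "2016-03-01 10:15:00 ERROR user=u42 payment failed",
    "free text that does not follow the layout",
]
records = [rec for rec in (parse_line(line) for line in raw_lines) if rec]
```

The resulting `records` are plain dictionaries, so they can be joined with an existing structured data set on a shared key such as `user`.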

To increase revenue for marketers on e-commerce sites within a span of 1 month by providing an efficient solution based on click stream analytics.

Our Approach:

  • Click stream analytics is a powerful application of big data analytics that looks at which web pages visitors view and in what order. This data is collected and analysed by businesses.
  • Through this analytics, site traffic can be identified, and web marketers can look into the user experience: how many products the user checked, how fast or slow the pages load, and how often the user clicks the back or cancel button in the browser.
  • Our approach is about sending recommendations to users based on their browsing history on e-commerce sites. The process starts by identifying abandoned items in the cart of a consumer/end user.
  • The data is ingested from Adobe Omniture into Hadoop. The items are then bucketed into different categories.
  • Based on the category to which the product belongs, our recommendation engine sends ads/offers in that category to the end user. Using this analytics, marketers can quantify how effectively the site produces sales.
  • It also shows which pages users are more likely to visit, the items that are placed in or removed from the cart, and the purchased items.
  • Implementation of the solution led to an increase in revenue within a span of 1 month.
  • Business people or marketers can make changes to the site to reduce bounce rates and increase sales.
  • The advantage is that customer behaviour is identified when customers visit the website, and analysing it can be important for staying competitive.
  • It can also be predicted whether the customer is likely to purchase from the website or not.
  • This provides information on how users interact with the site, which helps the further development of the site.
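
The abandoned-cart step described above can be sketched as follows. The event names (`add_to_cart`, `remove_from_cart`, `purchase`) and the item-to-category mapping are hypothetical; an actual clickstream export would use its own event schema.

```python
from collections import defaultdict


def abandoned_items(events):
    """Items each user added to the cart but neither removed nor purchased."""
    carts = defaultdict(set)
    for user, action, item in events:
        if action == "add_to_cart":
            carts[user].add(item)
        elif action in ("remove_from_cart", "purchase"):
            carts[user].discard(item)
    return {user: items for user, items in carts.items() if items}


def categories_to_target(abandoned, item_category):
    """Categories in which to send ads/offers to each user."""
    return {
        user: {item_category[i] for i in items if i in item_category}
        for user, items in abandoned.items()
    }


events = [
    ("u1", "add_to_cart", "shoes"),
    ("u1", "add_to_cart", "socks"),
    ("u1", "purchase", "socks"),
    ("u2", "add_to_cart", "hat"),
    ("u2", "remove_from_cart", "hat"),
]
abandoned = abandoned_items(events)
targets = categories_to_target(abandoned, {"shoes": "footwear", "hat": "accessories"})
```

Only `u1` has an abandoned item here, so only `u1` would receive offers, and only in the category of the abandoned product.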


To increase the amount of revenue which is profitable to the marketers.

Our Approach:

Market basket analytics gives business people a better understanding of customer purchasing behaviour.
Customers generally take different paths, and most of them buy the same products.
This helps to find the common interests of the customers and the common paths they take to make a specific purchase.
If a customer buys certain products, he or she is more or less likely to buy certain other products or items.
This analytics is all about generating recommendations/combo offers for products.
For this particular use case we applied association rules between various products based on historical transactional data.
An association model was built which explains how items or events are associated with each other.
Using this association model we found the items that tend to be purchased together, which were later used for recommendations.
This solution increased revenue significantly.
Knowing what products the customer is purchasing can be helpful to business people or the retailer.
It also helps to determine the new products that can be offered to prior customers and the pairing of products that are commonly purchased together.
It provides an understanding of sales patterns and helps the company manage inventory and resolve the problems that are blocking it from reaching its goal.
Blind marketing can be eliminated, and these benefits increase both sales and the profit margin.
Both the company and the consumer benefit, which supports growth in retail.
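
The association rules mentioned above are usually described by support (how often items appear together) and confidence (how often the second item appears given the first). The sketch below computes both for item pairs only; the baskets and the `min_support` value are made up for illustration, and a production system would more likely use a dedicated algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread"],
    ["jam"],
]
rules = pair_rules(baskets)
confidence = {(a, b): conf for a, b, _, conf in rules}
```

In this toy data every basket containing butter also contains bread, so the rule butter → bread has confidence 1.0 and would be a natural candidate for a combo offer.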


Identify the customers who have the potential to invest more than they currently have invested, which will enable the organization to achieve growth.

Our Approach:

The initial approach is to do Customer profiling by gathering a list of the customers which includes Customer name, location, contact information, income.

Once the customer information is gathered the next step is to do Investment Profiling which includes the investment information as listed below:

  1. Details of the investment made by the customer.
  2. The portfolio in which the customer has invested.
  3. The amount the customer has invested in the portfolio.

How is the solution achieved by SDT?

The solution is achieved by a Cross Bucket Analysis between Customer Profiling and Investment Profiling.
Through SDT's approach, we created a model that allows us to identify the target customers who can increase their investment in the firm.

How is it done?

  • The organization keeps track of customer details.
  • Generally, customers invest some amount in some investments.
  • The organization cannot predict the potential of a customer from the existing investments alone.
  • In the first step, the Customer Profiling and Investment Profiling data are merged.
  • After merging, the customers who have more potential to invest are identified.
  • The patterns obtained from the merged data can be bucketed.
  • The potential of a customer can then be predicted from the bucketed pattern to which the customer belongs.
  • Even when a customer is capable of investing more but has invested a smaller amount, the organization can predict this from the patterns obtained and increase sales.
  • This approach is profitable, reliable and effective because it increases sales through the existing customers alone.
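
The merge-and-bucket idea above can be sketched as follows. Everything specific here is an assumption made for illustration: the income cut-offs in `income_bucket`, the field names, and the rule that a customer has untapped potential when the invested amount is below the median of his or her income bucket. It is not the model SDT built.

```python
from collections import defaultdict
from statistics import median


def income_bucket(income):
    """Hypothetical income buckets."""
    if income < 50_000:
        return "low"
    if income < 150_000:
        return "mid"
    return "high"


def potential_investors(customers, investments):
    """Customers investing less than the median of their income bucket."""
    # Merge customer profiling and investment profiling on customer id.
    merged = {
        cid: {**profile, "invested": investments.get(cid, 0.0)}
        for cid, profile in customers.items()
    }
    by_bucket = defaultdict(list)
    for row in merged.values():
        by_bucket[income_bucket(row["income"])].append(row["invested"])
    medians = {bucket: median(values) for bucket, values in by_bucket.items()}
    return sorted(
        cid for cid, row in merged.items()
        if row["invested"] < medians[income_bucket(row["income"])]
    )


customers = {
    "c1": {"income": 200_000},
    "c2": {"income": 180_000},
    "c3": {"income": 160_000},
    "c4": {"income": 30_000},
}
investments = {"c1": 50_000, "c2": 40_000, "c3": 5_000, "c4": 1_000}
targets = potential_investors(customers, investments)
```

Customer `c3` sits in the high-income bucket but has invested far less than comparable customers, so it is the one flagged as a target.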


The PoS (Point of Sale) data is loaded into the big data Hadoop platform, and analytics is performed on it to drive reporting and forecasting.

Our Approach

  • The PoS data includes the details of the products purchased by the customer and can also be used to track customer orders and frequently purchased products.
  • PoS data records each and every sale, and our approach rapidly assesses business performance while identifying trends.
  • Analyze the transactional PoS data, such as how well the items sold, which helps the organization identify seasonal purchasing trends.
  • The insights come from actual sales information, so sales are tracked from data rather than guesses.

The attributes of the PoS data usually include the information below:

  • Invoice number
  • Product_SKU id
  • Description of product
  • The quantity purchased by the customer
  • Invoice date
  • Price for single quantity
  • Customer id
  • Country
  • Whole price for multiple products
  • The hour in which the product is purchased.

Performing analytics on PoS transaction data allows us to do the following reporting and forecasting on the business intelligence end to support better decisions.

SDT recently performed analytics on PoS data for India's largest clothing and apparel company and provided better insights for growth.

Reporting:

SDT generated the reports below using big data analytics performed on PoS machine data:

  • Identify the peak transactions on an hourly basis in each quarter of the day.
  • Identify the top revenue generating products sold.
  • Identify the top revenue generated on transactions on an hourly basis.
  • Identify the top revenue generating products across various branches, countries and regions.
  • Identify the hourly sales of products in various branches, countries and regions, so that peak visiting times can be identified and staff can be scheduled to manage the customers accordingly.
  • Identify the products usually sold per hour in each quarter of the day, and the quarter of the day in which serious customers usually visit the branch.
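
Two of the reports above can be sketched from PoS rows directly. The field names (`sku`, `quantity`, `unit_price`, `hour`) mirror the attribute list but are hypothetical, and "quarter of the day" is assumed here to mean a six-hour block (hours 0 to 5, 6 to 11, 12 to 17, 18 to 23).

```python
from collections import Counter


def peak_hour_per_quarter(rows):
    """Busiest hour (by transaction count) within each six-hour quarter of the day."""
    counts = Counter(r["hour"] for r in rows)
    peaks = {}
    for hour, n in counts.items():
        quarter = hour // 6
        best = peaks.get(quarter)
        # On a tie, keep the earlier hour.
        if best is None or n > counts[best] or (n == counts[best] and hour < best):
            peaks[quarter] = hour
    return peaks


def top_revenue_products(rows, k=3):
    """Top k products by total revenue."""
    revenue = Counter()
    for r in rows:
        revenue[r["sku"]] += r["quantity"] * r["unit_price"]
    return revenue.most_common(k)


rows = [
    {"sku": "SHIRT", "quantity": 2, "unit_price": 500, "hour": 9},
    {"sku": "JEANS", "quantity": 1, "unit_price": 1500, "hour": 9},
    {"sku": "SHIRT", "quantity": 3, "unit_price": 500, "hour": 13},
    {"sku": "SCARF", "quantity": 1, "unit_price": 300, "hour": 13},
    {"sku": "JEANS", "quantity": 1, "unit_price": 1500, "hour": 13},
]
peaks = peak_hour_per_quarter(rows)
top_products = top_revenue_products(rows, k=2)
```

Grouping the same rows by branch, country or region before calling these functions gives the per-location versions of the reports.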


How is the solution achieved by SDT?

The PoS data is loaded into the big data Hadoop platform, and analytics is performed on it to drive the reporting and forecasting.


The following key points are achieved by performing analytics on the generated reports:

  • Maintain consistently high revenue for new products across the branches, countries and regions that are already generating top revenue.
  • Identify the particular branches where sales are not happening by analyzing the branches by region and by country.

How is it done?
Sales Forecasting:

Sales forecasting means predicting the sales for the upcoming day, month or year from the previous month's sales flow, using machine learning.
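
As a point of reference, the simplest possible forecast from past sales is a moving average. The sketch below is a naive baseline only, not the machine learning model referred to above; the `window` size and the sample figures are arbitrary.

```python
def moving_average_forecast(daily_sales, window=7):
    """Forecast the next day's sales as the mean of the last `window` days."""
    if not daily_sales:
        raise ValueError("daily_sales must not be empty")
    recent = daily_sales[-window:]
    return sum(recent) / len(recent)


next_day = moving_average_forecast([10, 20, 30, 40], window=2)
```

A learned model is usually judged by how much it improves on a baseline of this kind.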

How would sales forecasting benefit the company?

  • How many peak transactions could occur on an hourly basis in each quarter of the day?
  • Which could be the top revenue generating products?
  • How much revenue could the top transactions generate?
  • Which could be the top revenue generating products across various countries?
  • What could the hourly sales of products be across various branches, by region and by country?
  • In which regions may sales hit a high, in which may they fall, and for which products?


The analytics behind the reports is completed by analyzing the inventories, which helps the business team estimate sales and meet the profit line.

Properly collecting and reporting this data brings success to retailers and makes retail operations efficient.

SDT's analytics on PoS data provides apt information that helps build relationships with customers, a key to success for retailers, and increases the profit margin.


To perform product-specific analytics and understand the good and bad parts of the product based on opinions from social media. Identify the areas to focus on for branding on TV/FB/Twitter and understand customer pain points.

Our Approach:

Cross-product analytics to highlight the good parts of our product and the bad parts of competitors' products.

Using this approach, the merchandiser gets to know information such as:

  • How many customers switched from their products to other products?
  • The reasons behind switching to other products.
  • How many people have switched from other products to their product?
  • The reasons why people are switching from other products to our product, so that we can highlight them in our ad campaign.
  • The same cross-product analytics can be sold to other vendors.

Based on our approach, social media analytics is used to meet the objectives and unlock the valuable insight available on social platforms.

  • Initially, data from 20-25 sources such as Facebook, Twitter, Yelp, brb and Consumer Affairs is scraped and brought into a common format.
  • Once the data is in a common format, it is cleaned, and the data sets scraped from the different platforms are merged in Hadoop.
  • Data science algorithms for topic modelling, such as MALLET, are applied; once topic modelling is done, the topics/containers are categorized.
  • Sentiment analytics is then performed on those topics.
  • Based on the sentiment output/polarity, each topic is categorized as good or bad.
  • Topics where we observe extreme polarity are classified further.
  • The reasons that make customers happy are also identified.
  • Cross-verify with the product company whether it already knows about the good things and whether these topics are part of its campaign/branding.
  • Within the bad topics, the reasons that are making things worse and increasing the customer churn rate are identified.
  • The product company can focus on these issues to reduce the churn rate, which gives the organization an edge over the competition.
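
The polarity step above can be sketched as follows, assuming each scraped mention has already been assigned a topic and a sentiment score between -1 and 1 by upstream tools. The `extreme` cut-off of 0.5 and the extra "neutral" label are assumptions made for illustration.

```python
from collections import defaultdict


def topic_polarity(scored_mentions, extreme=0.5):
    """Average the sentiment per topic and label topics with extreme polarity."""
    scores_by_topic = defaultdict(list)
    for topic, score in scored_mentions:
        scores_by_topic[topic].append(score)
    result = {}
    for topic, scores in scores_by_topic.items():
        mean = sum(scores) / len(scores)
        if mean >= extreme:
            label = "good"
        elif mean <= -extreme:
            label = "bad"
        else:
            label = "neutral"
        result[topic] = (mean, label)
    return result


mentions = [
    ("battery", -1.0), ("battery", -0.5),
    ("camera", 1.0),
    ("price", 0.25), ("price", -0.25),
]
polarity = topic_polarity(mentions)
```

Topics labelled "bad" are the ones to examine for churn reasons, while "good" topics are candidates for the ad campaign.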

About the product:

  • The voice analytics product converts voice to text with accurate recognition.
  • It is capable of recognizing bilingual speech.
  • The voice analyzer provides a visual representation as well as a pluggable library for pitch, emotion and sentiment recognition and business keyword identification.
  • It identifies positive and negative words, day-wise and month-wise sentiment, and the frequency of business words.


Our product meets most of the requirements that similar products in the market do not. The features are listed below:

  • Bilingual support
  • Cloud Support
  • Support for open source technologies
  • Real time search type
  • Pluggable library available
  • Pitch recognition available
  • Picking Business words
  • Emotion detection
  • Cross analysis of emotion with pitch and sentiment
  • Merging with CRM
  • Merging with Social media
  • Connection to different databases
  • Support Stereo and mono type

Case studies:

  • This case study shows the advantage an organization can gain from this product for customer retention.
  • Information about customers enquiring about prepayments can be obtained by merging the voice product with the CRM. This information, given directly by the customers, can be used to retain more customers by taking appropriate measures, and can also prevent losses for the organization.
  • For any appliance company it is crucial to keep an eye on information such as faulty pieces, the number of complaints registered in an area, the complaints about a specific product, and lack of support from the agents.
  • All of this information can be obtained by merging our voice product with the company's CRM. Through the recorded calls, the information can be identified and corrective measures can be taken to improve the product or the services.


  • Our product deals with a large number of challenges and meets requirements that similar products in the market do not. For example, even when the speaker's voice is surrounded by noise, it still recognizes the speech accurately, as it removes the background noise and focuses on the speaker.
  • A speaker may use many different words, and all of them are recognized accurately. Although accents vary from person to person, the product recognizes the various accents and converts the speech to text accurately.
  • The product recognizes multilingual usage of regional languages. It analyzes the audio data, detecting things like emotion and stress in a customer's voice, the reason for calling customer care, the products mentioned and more.
  • Operational and performance issues that occur throughout the enterprise can be tracked and managed, which leads to improved service quality.
  • Merging voice with the CRM helps deliver the right service at the right time. This helps in building strong relationships with profitable customers and in identifying specific products and services that can benefit customers.


  • Customer Retention: Through voice analytics we can verify whether the right information is provided to the customer. Wrong information might affect the credibility of the organization.
  • Customer Satisfaction: Customer satisfaction can be measured by analyzing the sentiment of the call.
  • Inventory Management: The organization can proactively maintain stock or send products to various regions based on the issues arising in those areas.
  • Sales Opportunity: If a particular crop is not supported and a lot of customers are enquiring about it, the organization can consider supporting that crop and come up with new products for it.
  • Effectiveness of Call Center: We can analyze the sentiment of the calls, judge which employees handle calls effectively, and route repeated calls from customers who expect some information to those employees.
  • We can also capture the following types of patterns from voice analytics:
    1. From which regions we usually get more calls and from which we get fewer.
      We can analyze why we are getting fewer calls from specific regions and take the necessary actions to increase the marketing scope of the organization in those areas.
    2. For which products we usually get more enquiries.
      This helps us maintain the inventory accordingly and focus on marketing the products for which we get fewer enquiries.
    3. Which kinds of problems we get from which regions.
      This allows us to proactively send products to those areas based on the kind of problems.
    4. For which crops we usually get more enquiries from various regions.
      This tells us what kinds of crops are cultivated in various regions, so we can market products accordingly in those areas.
    5. Overall, what kinds of enquiries we get.
      We can classify a call based on whether the enquiry was about price, weather, a crop, a product or a problem.
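
The enquiry classification in the last point can be sketched on the transcribed text of a call. The keyword lists below are invented for illustration; a real deployment would derive them from the organization's own business words or use a trained classifier.

```python
import re

# Hypothetical keyword lists for the five enquiry types.
CATEGORY_KEYWORDS = {
    "price": {"price", "cost", "rate"},
    "weather": {"weather", "rain", "monsoon"},
    "crop": {"crop", "paddy", "wheat"},
    "product": {"product", "fertilizer", "pesticide"},
    "problem": {"problem", "issue", "damage"},
}


def classify_call(transcript):
    """Return the enquiry type whose keywords occur most in the transcript."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


calls = [
    "What is the price of this fertilizer? The cost seems high",
    "There is heavy rain and the weather is bad",
    "Hello",
]
labels = [classify_call(c) for c in calls]
```

Counting these labels by region or by product gives the patterns listed above.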