The Top Six Big Data Challenges in Education

Top Big Data Challenges

The path to the successful application of Big Data to educational institutions is going to face at least six major Big Data challenges or roadblocks that will have to be addressed one at a time:

Integration across institutional boundaries – K-12 schools are generally organized around academic disciplines. Universities are organized as separate schools, faculties, and departments. Each of these units operates somewhat independently of the others and shares real estate as a matter of convenience. Integrating data across these organizational boundaries is going to be a major challenge. No organizational unit is going to surrender any part of its power base easily. Data is power.

Self-service analytics and data visualization – It is going to be a piece of cake to give planners and decision makers the technology-based tools they need to do their own analytics and visualize the results of their studies graphically. It is going to be a genuine challenge to create a culture that requires them to do their own studies using those tools. An even greater challenge will be to create a climate in which the results of those studies actually inform their decision making, because they are so accustomed to making decisions intuitively.

Privacy – There is a great deal of concern – perhaps even excessive concern – about the privacy of the information collected about each student and her family. The concern is that this data could fall into the wrong hands or be abused by those who have been given responsibility for safeguarding the information. To some extent, this is a technological and management issue. However, the fundamental issue is fear that the technical and management safeguards either won’t work or will be abused. Lisa Shaw, a parent in the New York City public school system said, “It’s really invasive. There’s no amount of monetary funds that could replace personal information that could be used to hurt or harm our children in the future.”

Correlation vs. cause and effect – Purists in rational argument want to see arguments that clearly spell out cause-and-effect relationships before blessing them as a basis for decision making. The fact that two factors may be highly correlated does not satisfy this demand for cause and effect. Nevertheless, real-world experience in other areas of Big Data has shown that high correlations are sufficient by themselves to make decisions that are either lucrative or achieve the objectives the players have in mind. This means they have been able to realize significant benefits based on correlation without being able to explain the underlying mechanics.

Money – Nearly all educational institutions are strapped for money. When they make decisions to invest in the hardware, software, staff, and training to exploit Big Data, they are making decisions not to hire another professor, equip a student lab, or expand an existing building. That can be a tough call.

Numbers game – Some argue – perhaps rightfully so – that Big Data reduces interactions with students to a numbers game. Recommendations and assessments are based entirely on analytics. This means that compassion, personal bonding, and an understanding of the unique circumstances of every student get lost in the mix. Others argue that Big Data is an assist to the human process. In any event, this is unquestionably a stumbling block.

Privacy vs. Evidence Based Research

There is a great deal of concern about student privacy as we mentioned above, and it is one of the top Big Data challenges that must be resolved. One of the key reasons for this concern focuses on the process of growing up itself. It’s not unusual for students to participate in activist organizations in their youth that they reject later in life. Or they drank too much in university but sobered up once they had the responsibilities of jobs and families. Or a teacher may have given a student a negative evaluation that should not have survived his graduation or departure from the school. In the past, we simply forgot these things. Life moves on and we don’t give a great deal of attention to what happened 25 years ago. But permanent records that can be pulled up and viewed decades later may cast shadows on job candidates that are completely unwarranted at that time. In other words, we lose the ability to forget.

There is an even greater threat, though. Although there is general agreement about the value of predictive analytics, no one pretends that the predictions are inevitable. Nevertheless, a computer-generated prediction can take on the aura of truth. A prediction that a student is not suitable for a particular line of work may prevent hiring managers from hiring her for a position she is perfectly well suited to handle. These predictions can severely limit her opportunities in life forever.

One way of dealing with this is to pass legislation that limits access to student information, protects the identity of individuals, and yet still makes it available to those conducting legitimate educational research. Unfortunately, this ideal is better served in rhetoric than in reality.

Consider stripping student records of any identifying information and releasing them, along with the records of other students in the same cohort, for general access for educational research. Yes, the school has taken all the required and appropriate steps to protect the students’ identity. But, no, it doesn’t work. That’s because Big Data practitioners generally access large data sets from a wide variety of sources. Some of those other sources (e.g., Facebook) make no attempt to protect the individual’s identity. Those secondary sources contain enough identifying characteristics that they can be accurately correlated with the de-identified school records to re-identify them. The best-laid plans of mice and men …
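To make the re-identification risk concrete, here is a minimal Python sketch of a linkage between a de-identified data set and a public one. Everything in it is an assumption made for illustration: the field names, the choice of ZIP code, birth date, and gender as the shared characteristics, and all of the records are invented. Real linkage work is typically fuzzier and draws on far more sources.

```python
# Hypothetical data: every name, field, and value below is invented for illustration.
deidentified_school_records = [
    {"zip": "10027", "birth_date": "2001-03-14", "gender": "F", "gpa": 2.1},
    {"zip": "10027", "birth_date": "2001-07-02", "gender": "M", "gpa": 3.8},
    {"zip": "11201", "birth_date": "2000-11-30", "gender": "F", "gpa": 3.2},
]

public_profiles = [
    {"name": "Student A", "zip": "10027", "birth_date": "2001-03-14", "gender": "F"},
    {"name": "Student B", "zip": "11201", "birth_date": "2000-11-30", "gender": "F"},
    {"name": "Student C", "zip": "94110", "birth_date": "1999-01-05", "gender": "M"},
]

# Assumed set of characteristics that appear in both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_date", "gender")


def link_records(school_records, profiles, keys=QUASI_IDENTIFIERS):
    """Return (name, school_record) pairs whose shared fields match exactly one profile."""
    # Index the public profiles by their combination of shared fields.
    index = {}
    for profile in profiles:
        index.setdefault(tuple(profile[k] for k in keys), []).append(profile["name"])

    matches = []
    for record in school_records:
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the "anonymous" record
            matches.append((names[0], record))
    return matches


reidentified = link_records(deidentified_school_records, public_profiles)
for name, record in reidentified:
    print(name, "->", record["gpa"])
```

In this toy example, two of the three "anonymous" school records end up attached to a name even though the school released no names at all.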

There is no shortage of legislation in the US to protect student information. The most relevant legislation includes:

  • The Family Educational Rights and Privacy Act of 1974 (FERPA). This act prohibits the unauthorised disclosure of educational records. FERPA applies to any school receiving federal funds and levies financial penalties for non-compliance.
  • The Protection of Pupil Rights Amendment (PPRA) of 1978. This act regulates the administration of surveys soliciting specific categories of information. It imposes certain requirements regarding the collection and use of student information for marketing purposes.
  • The Children’s Online Privacy Protection Act of 1998 (COPPA). This act applies specifically to online service providers that have direct or actual knowledge of users under 13 and collect information online.

Unfortunately, this legislation is outdated and of limited use today. For example, it applies to schools but not to third party companies operating under contract to the schools. This legislation was enacted before the era of Big Data and doesn’t address the issues that this current technology raises. Further, the acts don’t include a private “right of action.” This means that students and parents cannot go to court themselves to enforce the law.

In light of this, there are ongoing legislative attempts to deal with the need to protect the privacy of student information. By September 2015, 46 states had introduced 162 bills dealing with student privacy; 28 of those bills had been enacted in 15 states. There have been ongoing initiatives at the federal level as well. Relevant pieces of federal legislation that have been introduced include:

  • Student Digital Privacy and Parental Rights Act (SDPPRA)
  • Protecting Student Privacy Act (PSPA)
  • Student Privacy Protection Act (SPPA)

These acts are primarily concerned with protecting student data that schools pass along to third party, private sector companies for processing. In spite of the fact that these companies have generally built in their own data protection policies and procedures that already meet the requirements of this legislation, there is still considerable fear that the companies will use the data for nefarious purposes such as tailoring marketing messages to particular students – something that is clearly outside the scope of providing education or conducting educationally related research.

The US is not alone in its concern. The European Union has developed regulations that apply throughout the EU. This is in contrast to the fragmented American approach. To be fair to the Americans, however, the US Constitution leaves education to the states rather than to the federal government.

The EU 1995 Directive 95/46/EC is the most important EU legal instrument regarding personal data protection of individuals. Rather than discourage the use of third parties storing and processing student information, the EU prefers to regulate it. The EU recognizes that private sector companies provide a valuable service.

The Directive gives parents the option of opting out of data sharing arrangements for their children. However, doing so would likely jeopardize the educational opportunities their children would enjoy otherwise. In other words, while parents have the right to opt out, it would be imprudent in practice to do so.

After considerable discussion and consultation, the EU Parliament approved the General Data Protection Regulation (GDPR or Regulation). This Regulation is set to go into effect in May 2018. This Regulation pays particular attention to requiring schools to communicate “in a concise, transparent, intelligible and easily accessible form, using clear and plain language, in particular for any information addressed specifically to a child.”

Unfortunately, this is problematical. Big Data and Machine Learning develop algorithms that are quite opaque. Even the professionals who operate Big Data systems don’t know the inner workings of the algorithms their systems develop. Interestingly, they don’t even know which pieces of input are pivotal to the output and recommendations of those systems. In this context, it is reasonable that the general public sees EdTech companies as a threat to students’ autonomy, liberty, freedom of thought, equality and opportunity.

On the other hand, when you visit these EdTech websites, it certainly appears that they are driven by a sense of enlightenment. Their websites clearly suggest that they have the best interests of the students and their client schools in mind. Aside from the opaque nature of Big Data and Machine Learning algorithms, it is not clear – to this author at least – that EdTech companies deserve to be treated as skeptically as they are. It’s quite possible that the nub of the issue is not the stated objectives and current operations of these companies, but rather the uses that this data might be put to in the future and have not been foreseen today. In other words, the way the data might be used in the future is unpredictable. The unpredictable uses of the data could lead to unintended consequences.

In both Europe and the US, when we look at the furor about the importance of the privacy of student information, it often boils down to pedagogical issues.

Here is the conundrum in a nutshell. There is clearly a potential benefit to conducting educational research using student information. There is good reason to believe that tracking students over the course of their academic years – and perhaps even into their working careers – would allow scholars to identify early indicators of eventual success or failure. However, if restrictions on student identification or on the length of time data can be stored prevent scholars from doing that work, that sort of research cannot be conducted. This could conceivably lead to a loss of value both to individual students, who could benefit from counseling informed by reliable research, and to society at large.

How Is the Future of Big Data in Education Likely to Unfold?

Here are the trends to look for – in no particular order. These trends are instrumental in informing the schools’ policy development, strategic planning, tactical operations, and resource allocation, and overcoming the Big Data challenges in Education.

Focused student recruitment – Historically, colleges and universities have had student recruitment programs that were fairly broad in terms of geography and demographics. This led to a large number of student applications for admission. Unfortunately, many of the students the institutions accepted did not enrol in those schools. Colleges are now using Big Data to find those geographic areas and demographics where their promotional efforts not only generate large numbers of high caliber applicants, but also applicants who, if accepted into the college, will actually enrol.

Student retention and graduation – Universities need to do more than attract high caliber students. They need to attract students who will stay in school and graduate. Big Data coupled with Machine Learning can help identify those students. In parallel with student recruitment, the schools will increasingly use Big Data to identify at-risk students at the moment they show signs of falling behind. This will enable the schools to assist the students, help ensure their success, retain them in school, and increase the chances they will graduate.

Construction planning and facility upgrades – Educational institutions at all levels have more demands to add or expand their buildings and upgrade their facilities than their budgets will permit. They need to establish priorities. Big Data will help planners sort through the data to identify those areas that are likely to be in highest demand and provide the greatest benefit to the students and the institutions.

Data centralization – At the moment, nearly all data in educational institutions is held in organizational silos. That means that each department or organizational unit collects, stores, and manages the data it needs for its own purposes. That is a natural result of the need for each function to get its work done. However, it is counterproductive if we wish to apply Big Data. In the future, we can expect these siloed data stores to be integrated either physically or virtually. Physical integration means that the data will be moved to a central repository and managed by a central function – like the IT department. Virtual integration means that the data will remain in the functional units where it is at the moment but the IT department will have read access to each of these repositories. Quite likely, we will see both options in practice for the foreseeable future.

Data-based decision making and planning – Although Education has enjoyed the benefit of quantitative studies for centuries, the practice of education is generally driven by the philosophical views of educators more than by data- or evidence-based studies. In fact, this approach has been enshrined in our commitment to academic freedom at the university level and has trickled down, to some extent, to public and private K-12 schools. Big Data will enable a data-rich culture that will inform policy development and operational planning to an extent we’ve never seen in the past.

Greater use of predictive analytics – Machine Learning applied to Big Data will become increasingly successful at predicting students’ future success based on their past performance. Schools of all stripes will rely on these predictive analytics more and more in the future. This is likely to lead to two types of outcomes. On the one hand, schools will allocate more resources to those students most likely to succeed and, as a result, graduate more high-performing students who will deliver significant benefits to their communities and the world. On the other hand, predictive analytics will restrict the academic opportunities of failing students or those who show little promise – like Albert Einstein. Predictive analytics will also help institutions develop counter-intuitive insights that will challenge long cherished values and lead to better student and institutional results.

Local adoption of analytics tools – Older readers will remember the days when word processing was handled by a pool of word processing typists. Over time, word processing migrated from the pool to executives’ assistants and, eventually, to the desks of the executives themselves. Once word processing reached the desks of the executives and other knowledge workers, word processing shifted from being a mechanical function to being a creative one. Knowledge workers crafted their messages as they took form on their screens. The same will be true of predictive analytics. We are going to see the hands-on management of predictive analytics studies migrate from Big Data specialists to the desktops (and laptops) of executives who need to think through, propose, and defend policy statements, strategic plans, and operational or tactical initiatives.

User experience – Educators often don’t know a student is having a problem until they see the student failing (or just barely passing) quizzes and tests. But, even when they recognize the problem, they don’t know the reasons any given student is falling behind. Big Data will help students by recognizing the problems they have as those problems occur. Then it can offer tutorials that address those problems as they occur – not days or weeks later when it may be too late to affect the students’ learning trajectories.

Real-time quiz evaluations and corrective action – As computers and tablets become ever more pervasive in classrooms, schools at all levels will be better able to collect digital breadcrumbs about how students perform on quizzes and determine what corrective action is required. This is going to eventually become the norm. Steven Ross, a professor at the Center for Research and Reform in Education at Johns Hopkins University, agrees. He said, “Most of us in research and education policy think that for today’s and tomorrow’s generation of kids, it’s probably the only way.”

Privacy, privacy, privacy – The privacy of student and family data will continue to be a hot issue. Over time, however, the benefits of sharing data that includes student identification will outweigh the concerns of the general public. Sharing this data among qualified research professionals will become more socially acceptable not only as technological safeguards are put into place, but as they are accepted as being appropriate. In practice, society will discover that the student data it thought was secure is not. Witness the data breach at Equifax that spilled confidential data about 143 million people. Do you remember the data breaches at Target and Home Depot? Again, tens of millions of people who trusted these companies with their credit card information were affected.

Learning Analytics and Educational Data Mining – We are seeing a new professional discipline emerge. The professionals in this field will have both the professional and technical skills to sort through the masses of unstructured educational data being collected on a wholesale basis, know what questions to ask, and then drill through the data to find useful, defensible insights that make a genuine difference in the field of Education. The demand for these specialists is likely to outstrip the supply for many years to come.

Games – We are likely to see far more games introduced into the educational curriculum than we’ve ever seen before. Games are not only proven to be instrumental in the learning process, they also lend themselves to data acquisition for immediate or later analyses.

Flipped classrooms – Khan Academy has reversed the historical process of delivering course material during class time and assigning homework to be handled out of class. In its flipped classrooms, students watch streaming videos at their leisure out of class. Class time is dedicated to providing students a forum where they can work through their problem sets and ask for – and get – help as they need it. This flipped classroom is going to become far more widespread because our technologies today enable it – and it just makes a lot of sense.

Adaptation on steroids – Adaptation is nothing new. It’s been going on for thousands of years. The idea is that course material, explanations, problem sets, or tutoring are tailored to the individual needs of the student. But when we put that adaptation on steroids, we see a shift in “kind.” In other words, we see something that was not present before. Today we can monitor every move students make, not just count the right and wrong answers they give to quiz questions. By analyzing facial expressions, delays in responding, and a myriad of other variables, we can tailor-make and deliver a tutorial specifically suited to a student’s learning problem at the moment the problem occurs.

Institutional evaluation – Schools have always presumed to grade their students. Until relatively recently, it was presumptuous for students to grade their teachers or their schools. Now it is becoming common practice. In fact, Big Data will play an ever-growing role in assessing the performance of individual instructors. More importantly, Big Data will rank universities, colleges, and high schools on a wide range of variables that can be supported through empirical evidence. True, some of that evaluation will be based on “sentiment” – but much of it will be based on hard analytics that would previously have been too time consuming or too expensive to collect and analyze in a holistic manner.

The Jury Is Still Out

In spite of all the investment, the excitement, and the promise of Big Data in Education, we still don’t have enough experience to make categorical claims about its value. We are still struggling with the top Big Data challenges we face.

In an article in The Washington Post last year, Sahlberg and Hasak claimed that the promised benefits of Big Data have not been delivered. As a visiting professor at The Harvard Graduate School of Education, Sahlberg is an authority we should listen to. He claims that our preoccupation with test results reveals nothing about the emotions and relationships that are pivotal in the learning process. Our commitment to judging teachers by their students’ test scores has the effect of steering top performing teachers away from low performing schools – exactly where they are most needed. There are extensive efforts to evaluate both teachers and students. However, according to Sahlberg, this has NOT led to any improvement in teaching in the US.

The most that Big Data can offer is an indication of a high correlation between one factor and another. It cannot tell us about cause and effect. In fact, cause-and-effect arguments are difficult for people to make – and yet they are instrumental in building compelling arguments. Having said that, it is revealing to recognize that finding high correlations in other fields – even without a demonstrated cause-and-effect relationship – has proven to be quite beneficial.

Big Data is Transforming the Food and Beverage Industry

A 2015 McKinsey study reported that food retailers can improve their operating margins by up to 60% simply by harnessing the power of Big Data. In order to keep pace with consumers’ fickle buying habits, food and beverage companies need to begin combining raw point-of-sale data with the Big Data that is now available. Analytical capabilities then can transform this data into meaningful intelligence that can inform management decisions. Those decisions will boost sales and improve their overall bottom-line performance. For example, food and beverage retailers, suppliers, and trading partners can share Big Data to ensure they offer the right products, in the right quantities, in individual stores and online.

Big Data Helps Drive In-Store Revenues

Food and beverage companies can use Big Data to increase traffic to their brick and mortar stores. The GPS location capabilities of most mobile phones provide a channel for retailers to display “pop-up” promotional messages that are highly relevant to an individual’s specific location and past purchasing history. A shopper, for example, standing in a frozen food aisle can receive a text offering a discount for a certain ice cream flavor nearby that she has bought in the past.

Big Data Helps Schedule Food Deliveries

Big Data can optimize on-time deliveries of orders to restaurants, food chains, and home customers. Big Data systems collect recent information from various sources about road traffic, weather, temperature, routes, etc. and provide an accurate estimate of the orders’ arrival times. This data analysis helps ensure that food and beverage companies don’t waste their resources transporting stale products. They will deliver perishable food items while they are still fresh.

Big Data Helps Allocate Food and Beverage Across the Country

By using Big Data to track purchasing decisions from wholesalers down to the customer level, food and beverage companies can learn what products are being purchased and where. For example, a company might learn that customers in the Pacific Northwest are purchasing 15% more of a diet beverage than the nationwide average. Further, it may learn that customers in the Midwest are purchasing 15% less of that same beverage. This knowledge tells the company to ship more of the diet product to the Pacific Northwest and less to the Midwest.
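The arithmetic behind that kind of allocation decision is simple enough to sketch in a few lines of Python. The numbers, the region list, and the function names below are invented for illustration, and the "average" here is a plain average across regions rather than a true population-weighted nationwide figure, which a real analysis would likely prefer.

```python
# Hypothetical per-capita unit sales of one diet beverage by region (invented numbers).
regional_sales_per_capita = {
    "Pacific Northwest": 11.5,
    "Midwest": 8.5,
    "Northeast": 10.0,
    "South": 10.0,
}


def demand_index(sales_by_region):
    """Percentage difference of each region from the simple average across regions."""
    average = sum(sales_by_region.values()) / len(sales_by_region)
    return {
        region: round((value - average) / average * 100, 1)
        for region, value in sales_by_region.items()
    }


def allocate(baseline_units, index):
    """Scale a baseline shipment up or down by each region's demand index."""
    return {region: round(baseline_units * (1 + pct / 100)) for region, pct in index.items()}


index = demand_index(regional_sales_per_capita)
shipments = allocate(1000, index)

for region in regional_sales_per_capita:
    print(f"{region}: {index[region]:+.1f}% vs average, ship {shipments[region]} units")
```

With these made-up figures, the Pacific Northwest comes out 15% above the average and the Midwest 15% below it, so a baseline shipment of 1,000 units becomes 1,150 and 850 units respectively.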

Big Data Helps Maintain Consistent Food and Beverage Quality

Big Data allows restaurants to maintain consistent quality of their products. Consumers expect the same taste in food at the chain restaurants they love. The taste of food not only depends upon the proper measurement of ingredients, but also on their quality, storage, and season. Big Data analytics can analyze such changes and predict the impact of each on the food quality and taste. The insights from these analyses will be used to identify pain points and suggest measures for improvement.

Big Data Analyzes Customer Sentiment

Big Data can analyze customer sentiment by monitoring customer emotions expressed on social media networks. Food companies use sentiment analysis to track their customers’ emotions. They can assess negative reviews and take appropriate preventive steps before the word spreads. Large food retailers like McDonald’s, KFC, and Pizza Hut have found this particularly valuable.

Big Data Has a Good Idea What Customers Will Purchase Next

Food and beverage companies use Big Data for “market basket analysis.” Market basket analysis is a technique that predicts the item a customer is most likely to purchase next based on her purchase history and the items already in her cart. Food retailers and restaurants use these projections to create effective combo deals and improve their marketing messages. For example, if the market basket analysis identifies that a customer prefers a muffin with her coffee, then the retailer can create a combo to help her enjoy them together.
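A very small version of market basket analysis can be sketched in Python using nothing more than co-occurrence counts. This is a simplified illustration, not how any particular retailer does it: the transactions are invented, the function names are hypothetical, and it only looks at pairs of items using the standard "confidence" measure (how often B appears in baskets that contain A).

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; item names are invented for illustration.
transactions = [
    {"coffee", "muffin"},
    {"coffee", "muffin", "juice"},
    {"coffee", "bagel"},
    {"tea", "muffin"},
    {"coffee", "muffin"},
]


def pair_confidence(baskets):
    """Return confidence(A -> B) = count(A and B) / count(A) for every ordered pair seen together."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        item_counts.update(basket)
        for a, b in combinations(sorted(basket), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {(a, b): n / item_counts[a] for (a, b), n in pair_counts.items()}


def suggest_next(cart_item, baskets):
    """Suggest the item most often bought alongside cart_item, or None if it was never paired."""
    rules = {b: conf for (a, b), conf in pair_confidence(baskets).items() if a == cart_item}
    return max(rules, key=rules.get) if rules else None


print(suggest_next("coffee", transactions))
```

In this toy data set, three of the four baskets containing coffee also contain a muffin, so the muffin is the natural candidate for a coffee combo deal. Production systems usually add measures such as support and lift to avoid recommending items that are simply popular with everyone.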

Big Data in the Home Improvement Industry


The Home Depot is the unchallenged leader in the home improvement retail sector in terms of applying Big Data to advantage. The Home Depot collects data from its own website, promotional emails, and social media. It uses that information to drive traffic to its stores by improving its marketing programs. As a result, The Home Depot is beating investors’ expectations and is described as “Amazon-proof.” Interestingly, this is happening at the same time many retailers are struggling to connect with their customers and deliver meaningful results to their investors.

The Home Depot Will Spend $4 Billion Over Three Years on Big Data

The Home Depot is spending roughly $4 billion from 2016 to 2018 to improve the company’s e-commerce platform and physical stores and bolster the link between the two. The Home Depot is creating a system that allows customers to easily order what they need online, has store employees collect these orders in store, and lets customers drop by the stores to pick up their purchases in moments. This buy-online, pickup in-store (BOPIS) model has proven vital for The Home Depot. According to the 2016 Internet Retailer’s report, about 25% of The Home Depot’s $3.76 billion in total website sales, or nearly $1 billion, came exclusively from its BOPIS program.

The Home Depot Uses Big Data to Reconfigure Its Supply Chain

Using technology and Big Data to rethink The Home Depot’s supply chain has been a key part of the company’s success within the home improvement industry. It will prove increasingly vital as the company moves forward. The Home Depot has used Big Data to improve its supply chain in several ways:

  • Dynamic ETA gives customers delivery data and delivery estimates based on their exact location.
  • Sync is a multi-year project that will reduce shipping and inventory costs through better coordination between stores and distribution centers.
  • Its Customer Order Management System helps balance store and web inventories. It also enables buy-online, pickup in-store customers to choose the store with the shortest wait time for pick up rather than requiring customers to choose stores only by location.
  • An easy-to-use website and mobile shopping platform will make the customer experience more seamless while allowing The Home Depot to better collect customer data. It will use that data to further improve its Big Data initiatives.

Wayfair Is a Winner in Home Improvement Due to Big Data

Home goods e-commerce company Wayfair was created in the digital ecosystem of 2002. Since then it has thrived due to its consistent commitment to and use of Big Data. In 2016, Wayfair introduced a search with photo capability. This capability taps into Wayfair’s Computer Vision System. This Vision System is based on the company’s own machine learning techniques and its massive proprietary data sets. This system allows customers to upload images of furniture they are looking for and Wayfair will give customers search results that match the image provided as closely as possible.

The data collected from this visual search feature creates a powerful feedback loop, which makes Wayfair’s results more useful for customers in the home improvement industry. Wayfair measures the impact of its photo search system by tracking the number of loyal repeat customers. In the second quarter of 2017, the number of orders per customer and the number of repeat customers both increased year-over-year. Orders from repeat customers grew to more than 61% of total orders in the second quarter of 2017; this compares well to the 58% in the second quarter of 2016. Repeat customers placed 2.6 million orders in the second quarter of 2017, an increase of 55% year-over-year. These increases in repeat customers and their orders are testament to the effectiveness of the Big Data driven photo search capability.

Big Data is everywhere in the Retail Industry

Big Data is everywhere in the Retail Industry. It would be hard to find any part of the management of retail operations that is not deeply touched by Big Data. In fact, it is already clear that to survive in the Amazon era, all retailers will have to rely heavily on Big Data to help them stock the right merchandise, at the right times, in the right quantity, and at the right price. Those that ignore it will die.

This does not mean that only the large companies that have the resources to exploit Big Data will thrive while small companies will die. Small companies will be able to harness the power of Big Data – but they are likely to do so through niche consulting firms that have developed the professional and technical skills, hired a stable of experts, built the computing platforms, and acquired access to the massive data required to operate effectively in this field. Smaller retail outfits that buy their services will find them expensive – but the benefits should far outweigh those costs.

The biggest costs will likely not appear on the company’s financial ledger. The biggest costs will be the time, focus, and energy that senior and middle management will need to invest to come to grips with how to leverage Big Data and how to build investment arguments that make sense. This last point proved to be a major stumbling block when general data processing began to make inroads into large and then medium sized companies some 40 years ago. It is quite liable to prove to be a stumbling block in the application of Big Data as well.

Online and Store General Merchandisers

With regards to department stores, the most promising use of Big Data is through recommendation engines.

Recommendation engines — Recommendation engines use customers’ historical purchasing decisions to predict future purchases and recommend other products that may interest them. These engines have the potential to generate accurate product recommendations before customers even leave the webpage. Amazon, for example, sees a 30%-60% revenue uplift due to these recommendations alone. Recommendation engines are a widely used way of incorporating Big Data into department stores because they are easy to implement and have an immediate positive impact on revenue: they have been shown to have the potential to boost revenue by 24% on average.
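As a rough illustration of how such an engine can work, the sketch below counts which products appear together in past orders and recommends the most frequent companions. It is a minimal co-occurrence approach, not a description of Amazon's or any retailer's actual system; the function names and the sample order history are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(orders):
    """Count how often each ordered pair of products shares a basket."""
    co = defaultdict(Counter)
    for basket in orders:
        # set() so a product repeated within one basket is counted once
        for a, b in permutations(set(basket), 2):
            co[a][b] += 1
    return co


def recommend(co, product, k=3):
    """Return up to k products most often bought together with `product`."""
    return [item for item, _ in co.get(product, Counter()).most_common(k)]


# Hypothetical order history: each inner list is one customer's basket.
orders = [
    ["sofa", "lamp", "rug"],
    ["sofa", "lamp"],
    ["lamp", "desk"],
    ["sofa", "rug"],
]
co = build_cooccurrence(orders)
print(recommend(co, "sofa", k=2))
```

Production engines typically add weighting for recency, popularity, and customer similarity, but the underlying idea of learning from historical baskets is the same.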

Trend Forecasting — The second biggest use of Big Data in department stores is predicting trends and forecasting demand. Trend forecasting algorithms comb social media and web browsing habits to find what products and services are causing buzz. These algorithms also analyze ads to see what products marketing departments are pushing. The algorithms then compare the data gathered from social media with the data gathered from current ads to accurately predict what the top selling products for a given quarter will be, how to better market products, and how to develop more cost-effective marketing strategies.

These predictive algorithms assist retailers in making better informed decisions about stocking and product ordering. This capability is particularly helpful during the holiday season when shopping rates increase – machine learning can use historical shopping data to forecast future purchasing and revenue outcomes. It is anticipated that this kind of predictive analysis in department stores will grow from a $2.7 billion global market in 2015 to $9.2 billion by 2020, a CAGR of around 27%. In the US alone, predictive analysis from Big Data is expected to reach $3.6 billion by 2020. As of 2015, fewer than 25% of department stores had adopted predictive analytics. Between 2018 and 2020 this is anticipated to grow to 70%.
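The compound annual growth rate quoted above can be checked with a few lines of arithmetic. The sketch below assumes a five-year span (2015 to 2020) and uses the two market sizes from the text; the function name is only illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: the constant yearly rate that turns
    start_value into end_value after `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in billions of dollars, taken from the paragraph above.
rate = cagr(2.7, 9.2, 2020 - 2015)
print(f"{rate:.1%}")
```

The result comes out a little under 28%, which is in line with the rounded figure quoted in the text.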

After identifying trends, Big Data (particularly customer economic and geographic information) can be used to understand where and when this demand will come from. This helps businesses generate effective marketing and advertising campaigns. For example, Russia’s first online retailer found that demand for books rises when it gets colder during the winter months, so it increases the number of book ads its customers see. This ability to accurately forecast demand helps lower a business’s costs, because it is expensive to keep excess inventory on shelves, while having too little stock drives down revenue and decreases customer engagement and loyalty.

Price Optimization — The third main use for Big Data in department store retail is optimizing pricing. In retail, Big Data can help determine when prices should be dropped (markdown optimization) or when they can be raised without customer dissatisfaction (reflected by a reluctance to purchase). Before the advent of Big Data, markdowns occurred at the end of the buying season, with stores hugely discounting their remaining merchandise. The problem with this approach is that demand is already gone by the time the markdowns occur. Big Data analytics demonstrate that what is actually most effective in increasing revenues is to gradually lower prices once demand first begins to decrease. When the US retailer Stage Stores employed this technique, it beat its traditional end-of-season sales revenue over 90% of the time.

Weather Optimization — Big Data is particularly helpful in optimizing prices in accordance with weather conditions. The Weather Company (part of IBM) has found that “weather is one of the largest swing factors for economic and business performance” – 60% of shoppers change their behaviors when it is either raining or hotter than average. A 1°F drop in temperatures below 60°F produces a 2%-3% drop in apparel sales. Approximately 60%-70% of a retailer’s excess expenses are due to weather-impacted supply chain costs (e.g., trucks held up due to poor weather conditions). In the UK, if temperatures reach over 65°F there is a 22% rise in fizzy drink sales, a 20% rise in juice sales, and a 90% rise in lawn furniture sales. In the US, temperatures below 64°F increase sales in soup, porridge, and lip care. Food, drink, pharmaceutical, and apparel sales are the categories most impacted by weather.

Targeting Individual Consumers — The final use of Big Data in general retail is identifying individual customers and how to most effectively market to and target them specifically – whether through email, text, or location-based alerts. Retailers, for example, can install sensors in their stores to identify customers’ locations through their smartphones. If a customer’s smartphone’s WiFi is turned on, it will attempt to connect with the store’s internet and this is how a customer’s location can be sensed and tracked. Retailers can then track what specific stores she visited, what departments she visited, and what products she purchased at what time and on what date. This information can be used to better understand each customer’s movements and patterns when it comes to shopping. Retailers can then use this information to reorganize their stores to optimize customers’ shopping experience and even to offer special deals and coupons to bring further business to their stores.

General Online Retailers

In addition to the Big Data applications listed above, there are four other applications that apply specifically to online retailers.

Dynamic Pricing — Dynamic pricing is Big Data at its finest. Dynamic pricing is highly responsive to external factors such as consumer demand and competitors’ prices. It collects trend data about which products are being bought and uses it to adjust prices automatically. Its analytic capabilities slowly increase prices on items that are popular and discount prices on items that are less popular. Dynamic pricing is key to increasing online retailers’ overall revenue.
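The rule described above can be sketched as a small function that nudges a price up when demand runs ahead of expectations, nudges it down when demand lags, and keeps the result in line with a competitor's price. Every threshold here (the 2% step, the 10% demand band, the 5% competitor margin, the 70%-130% band around list price) is a hypothetical assumption chosen for illustration, not a figure from any retailer.

```python
def adjust_price(current, list_price, demand_ratio, competitor_price, step=0.02):
    """Return a new price for one item.

    demand_ratio is recent units sold divided by expected units sold.
    All thresholds below are illustrative assumptions.
    """
    if demand_ratio > 1.1:        # item is popular: raise the price slowly
        proposed = current * (1 + step)
    elif demand_ratio < 0.9:      # item is lagging: discount it
        proposed = current * (1 - step)
    else:                         # demand is about as expected: hold
        proposed = current

    # Stay within 5% of the competitor's price on the high side.
    proposed = min(proposed, competitor_price * 1.05)

    # Keep the price inside a band around the list price.
    low, high = 0.7 * list_price, 1.3 * list_price
    return round(max(low, min(high, proposed)), 2)


print(adjust_price(100.0, 100.0, demand_ratio=1.5, competitor_price=200.0))
print(adjust_price(100.0, 100.0, demand_ratio=0.5, competitor_price=200.0))
```

Running the function repeatedly, for example several times a day, produces the gradual price drift the paragraph describes.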

Individual Customer Experience — Big Data analysis gives sellers insights about customer behavior and demographics and provides customers a personalized experience. For instance, customer data can be used to create buyer-specific e-mails for promotional campaigns. Amazon’s “Customers who bought this item also bought…” recommendation feature, for example, increased sales nearly 30% when it was first implemented. This is a simple and remarkably effective way to keep customers on a retail site and keep them buying. Consumers might have reservations about their favorite retailers knowing intimate details about their lives, but they’re going to love the results in practice. Sharing all those personal tidbits is helping companies like CNA identify fraud and prevent customers from having their identities compromised. Retailers can use information from live transactions and other sources (such as social feeds and geo data from apps) to prevent credit card fraud in real time.

Better Quality of Products — Amazon is the e-commerce standard when it comes to smart, effective pricing. It can easily access its competitors’ pricing data and respond quickly with its own deals — changing some items’ prices up to 10 times a day. The industry-wide shift to dynamic pricing means that companies will no longer be competing on price alone. They will now need to establish a reputation for offering their customers the best value and the best experience.

Reduce incidents of shopping cart abandonment — Companies can also use cross-device tracking to reduce shopping cart abandonment rates. eBay research found that the average consumer uses as many as three to five devices or platforms during the course of her buying journey. Mapping this journey with data allows retailers to help their customers transition from one device to the next and complete their purchases.




Big Data is transforming the Auto Industry


The next few years are going to see an explosion in the rate at which detailed data is collected about the moment-by-moment operation of nearly all new cars. This data will be stored and collated in centralized databases that make Big Data analyses possible. McKinsey published a report in 2014 that estimated that the global market for connectivity components and services for cars was $38 billion that year. The report went on to project that the data-driven connected car industry would grow to $215 billion.

Big Data Will Transform Fleet Management

Fleet management will enjoy the greatest benefit from this Big Data analysis. Auto makers will now be able to determine which settings and features drivers actually use. This will help them improve their marketing. It will also identify the features that drivers really care about; this will focus the auto makers’ on-going R&D efforts.

Further, automakers can easily monitor their cars, identify potential problems, and issue maintenance calls. This will help maintain their fleets in peak performance. They will be able to identify drivers who are abusing their cars; they can issue advisories based on that information. All of these efforts are geared to minimizing the maintenance costs and maximizing the performance of their fleets.

Big Data Is Transforming At Least Five Other Auto Practices

City Planning — City planners and engineers can use this same data to improve their plans for roadways and traffic flows.

Onboard Navigation — Navigation systems can use real time driving data to discover and display the fastest routes based on current traffic patterns.

Insurance Rates — Insurance companies can access the Big Data collected from connected cars to monitor each driver’s performance and, potentially, use this information to adjust rates and to determine what really happened in accidents.

Auto Dealer Marketing Campaigns — Dealers can use this Big Data to assist in planning their marketing campaigns. For example, Bullseye Prospecting is a product that helps dealers and their marketing agencies automate their marketing campaigns by leveraging third party and internal data on consumer behavior, incentives, and vehicle equity/valuation. This prospecting tool can cut the $600-$800 average per-car cost of sale by about 30%. It also helps dealers by sending a detailed, personalized message to their best customers at precisely the right time to prompt sales and services.

Used Car Valuations and Inventory Management — The 2016 Black Book survey indicated that nearly two-thirds of dealers are using 30%-50% more data since 2014 to establish vehicle valuations. This data also helps them set regional pricing, determine the appropriate supply of cars, and assess each vehicle’s history to manage their inventories. Some 69% of these dealers say the data is giving them better insights on pricing and profitability. 58% say Big Data is providing better insight into managing their inventory procurement. The majority of dealers believe they can avoid a market catastrophe similar to the one in 2008 because the data allows them to make more accurate decisions.

Cheat Sheet: Big Data and Mortgage Companies


It’s common knowledge that Big Data has arrived in the Mortgage Industry. One of the most important questions leaders in our industry need to ask themselves, of course, is “Where is it all going?” We’re going to give you our take on this issue in just a moment. But first, let me give a short synopsis of what Big Data is and how it can greatly benefit mortgage companies.

What is Big Data?

Historically, all the data computers used was set up in highly structured databases. In other words, we had separate fields for each piece of data and we spent a lot of time and effort to make sure all the data was clean and accurate. Big Data does away with that. Big Data reads data that was never meant to be analyzed by a computer. This includes everything from Tweets and Facebook postings to newspaper clippings. All of these were written for human consumption, not for computer processing.

Big Data cuts through that. Big Data is able to read all of this unstructured, messy stuff that was never meant for computers and make sense of it. In other words, it can read Tweets and Facebook postings and data from hundreds of different sources that are written in incompatible styles and assign meaning to what it’s reading. In the mortgage industry, this means that we can now tap into huge reservoirs of information that were always available to us in the public domain but that we could never get a computer to work with.

Now let’s take a look at where Big Data is going to take the mortgage industry.


Big Data Can Be Used to Streamline Processes at Mortgage Companies

  1. Pre-populate mortgage applications

We believe that Big Data is going to pre-populate mortgage applications. In other words, Big Data will mine data from bank records, publicly available databases, social media sites, and other sites to collect all or nearly all the information required for a mortgage application. This will leave the applicant with the option of either clicking to ratify the pre-populated application as accurate or editing a few fields here and there to fine-tune the application.

Another approach here is for prospective home owners to complete their mortgage applications as they always have and then the mortgage company’s computers will compare the pre-populated versions with the applicants’ versions to identify discrepancies.

In either case, the objective of this exercise will be to enhance the accuracy of the data in the applications at the same time the system reduces the burden on the applicants and mortgage companies.
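The comparison step described above, checking a pre-populated application against the applicant's own version, might look something like the following sketch. The field names, the sample values, and the 5% tolerance for numeric fields are all hypothetical assumptions made for illustration.

```python
def find_discrepancies(prepopulated, submitted, tolerance=0.05):
    """Return {field: (prepopulated_value, submitted_value)} for every
    field where the two versions of the application disagree."""
    issues = {}
    for field in prepopulated.keys() | submitted.keys():
        a, b = prepopulated.get(field), submitted.get(field)
        if a is None or b is None:
            # The field is missing from one of the two versions.
            issues[field] = (a, b)
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            # Numeric fields: allow a small relative difference.
            if abs(a - b) > tolerance * max(abs(a), abs(b)):
                issues[field] = (a, b)
        elif str(a).strip().lower() != str(b).strip().lower():
            # Text fields: ignore case and surrounding whitespace.
            issues[field] = (a, b)
    return issues


# Hypothetical data mined from external sources vs. the applicant's form.
prepopulated = {"annual_income": 85000, "employer": "Acme Corp", "years_at_address": 4}
submitted = {"annual_income": 98000, "employer": "acme corp ", "years_at_address": 4}
print(find_discrepancies(prepopulated, submitted))
```

Only the flagged fields would then need attention from the applicant or a processor, which is where the reduction in burden comes from.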

  2. Computer algorithms to score mortgage applications

We can also see that machine learning algorithms will score mortgage applications. These algorithms will approve or deny the applications immediately. Approved applications may be forwarded for processing right away. Rejected applications will qualify for a human review if the applicants don’t feel they have been scored properly. The goal of this instant evaluation will be to eliminate the delays in the current manual evaluation process – delays that are often measured in weeks.

We can see that Big Data will be instrumental in projecting the number of applications for new mortgages or refinanced mortgages in specific geographies and specific time frames. Further, Big Data will project the total value of these mortgages. These projections will help mortgage companies reposition their people and processing power based on projected market demand.  These projections would be based on the current mortgage portfolio the industry has in place in various geographic areas coupled with scenarios about shifts in mortgage interest rates.

  3. Big Data analysis of non-monetary defaults

We can expect to see Big Data analysis of non-monetary defaults on mortgages become more common if not universal. Here, I’m talking about flagging accounts where payments shift from being made early with an extra principal payment to being made on time with no extra payment. Or we will find homeowners whose home owner association is suing them. Or maybe the local government put a lien on the property on the grounds that the property is uninhabitable. Or the couple is getting divorced. These are all early warning signs that Big Data will track as a matter of course.
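The first warning sign above, a borrower who used to pay early with extra principal and now pays just on time with nothing extra, can be detected with a simple before-and-after comparison. The sketch below is one possible way to do it; the record layout, the three-payment window, and the thresholds are hypothetical assumptions.

```python
from statistics import mean


def payment_shift_flag(payments, window=3):
    """Flag an account whose payment behavior has deteriorated.

    payments: list of (days_before_due, extra_principal) tuples, oldest first.
    Returns True when earlier payments were early with extra principal and
    the most recent `window` payments are on time with no extra principal.
    """
    if len(payments) < 2 * window:
        return False  # not enough history to compare two periods

    earlier, recent = payments[:-window], payments[-window:]

    was_early = mean([days for days, _ in earlier]) >= 3
    paid_extra = mean([extra for _, extra in earlier]) > 0
    now_on_time = mean([days for days, _ in recent]) < 1
    now_no_extra = all(extra == 0 for _, extra in recent)

    return was_early and paid_extra and now_on_time and now_no_extra


# Hypothetical account history.
history = [(5, 200), (6, 200), (4, 150), (5, 200), (0, 0), (0, 0), (1, 0)]
print(payment_shift_flag(history))
```

None of these payments is late, so a traditional delinquency report would show nothing; the flag only reflects a change in habit that may deserve a closer look.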

  4. More Objective Residential Property Appraisals

Residential property appraisals will become more objective and more accurate. Big Data will propose the most appropriate neighborhood comparables. It will develop appraisals using industry standards that will be driven by an algorithm. MReport claimed that, “More than 30 percent of loans fall short of the collateral valuation agreed to between customer and loan officer.” Big Data will help fix that.

Big Data is bringing big changes

How is the Business of Big Data Affecting the Mortgage Industry? 

  1. Increase in spending on Big Data

Spending on Big Data applications and technology will soar.  In 2014, 2015, and 2017, we’ve seen Big Data spending in the mortgage industry at $2.6 billion, $2.8 billion, and $3.2 billion respectively. We are going to see spending on Big Data continue to climb as the number of success stories grows.

  2. Increased need for big data analysts in mortgage companies

Mortgage companies are going to suffer a severe shortage of Big Data analysts who know how to manipulate the huge and ever-growing quantities of data that will become available. We are going to need professionals who can manage the enquiries in ways that lead to highly defensible conclusions.  The growth in the demand for Big Data analysts is going to outstrip the supply.

  3. Increase in consultants

We are going to see the rapid growth of specialized firms that help mortgage companies plan for and implement Big Data projects. This function is going to be outsourced rather than treated as a core competence for several reasons. First, most mortgage companies will find it far too expensive to build their own in-house facilities. Second, the process of building their in-house facilities will take too long and is liable to run into many dead ends. Third, they will not be able to attract the talent they need at a price they can afford. Fourth, the management in existing mortgage companies will need to go through a steep learning curve that is best handled by a specialized firm. Over time, we can expect mortgage companies to build teams of in-house Big Data talent while leaving the technologies to cloud-based firms.

As a result, small mortgage companies that cannot afford to buy the necessary technologies will be squeezed out of business. Larger companies will buy them.

  4. Automation and Big Data will be important for mortgage companies

Mortgage companies are going to increasingly focus on building higher quality portfolios with fewer staff. The only way to have a smaller staff complement and a larger mortgage portfolio is through automation. That should be obvious. Automation in general and Big Data in particular are the way of the future.

*Warning*: New Players

We are going to see many new, non-traditional players in the mortgage industry.  They will spring from places like Silicon Valley.  They will offer better service at lower costs than banks and traditional mortgage companies. For example, the Lending Club facilitated $3.6 billion in loans in the first six months of 2015.  Likewise, Prosper is growing fast.

How Does Big Data Help Mortgage Companies Keep up with New Regulations and Laws?

We can expect the Federal Housing Administration to develop a growing number of regulations that the mortgage industry must comply with.  Many of these regulations will apply to a company’s portfolio of mortgages rather than any given mortgage.  Mortgage processors will continue to ensure that they comply with application specific compliance issues, but they cannot be expected to deal with portfolio-wide compliance issues.  In fact, it is unlikely that it is humanly possible to do so.  This means that mortgage companies will necessarily embrace Big Data to do that job for them.  Failure to do so means that they will face stiff penalties in court.  It is far better for these companies to catch non-compliance failures on their own and take action than to face their regulators in court.

Carl Pry, a managing director at Treliant Risk Advisors, said “It’s in every bank’s best interest to get one step ahead of the regulators and understand what that regulator is going to know and find. They need to resolve any discrepancies [and] do any file review analysis needed to be able to explain any disparities before the regulators find them.”

Here are a few more examples of how Big Data helps keep Mortgage companies out of legal trouble:

  • New regulations and compliance issues are making the appraisal process increasingly difficult. That, coupled with the fact that the number of qualified appraisers is not keeping up with the demand, means that the industry must necessarily rely on broad based, sophisticated tools like Big Data. This trend will continue.
  • Big Data is going to prove instrumental in flagging potentially fraudulent mortgage transactions. The FBI and other law enforcement agencies are developing increasingly sophisticated techniques to identify potential abuses. Big Data algorithms will incorporate these fraud detection techniques and trigger pre-emptive enquiries.

Big Data, Mortgage Companies, and the Mortgage Buyer: How Relations Can Be Vastly Improved

Decades ago the local bank manager knew his customers well and was in a position to make an informed judgment call about the amount of credit to be extended.  Bank managers rarely make those decisions in retail bank branches and mortgage companies today.  Rather, those decisions are made by a committee – often in another city.  We need to reinvest some humanity into the decision-making process. Incorporating social media will go a long way in that direction.

The mortgage approval process is going to become more transparent. At the moment, borrowers only know whether they are approved or rejected, but they rarely have an idea why they were slotted where they were.  In the future, mortgage companies will be in a position to coach their applicants very specifically about what they need to do to be approved.

Additionally, Big Data is going to help reduce the risk in mortgage lending. Big Data will help brokers advise their clients about school performance and community crime rates. This will help the buyers make better-informed decisions and, ideally, lead to lower risk mortgages.

**Warning:** Potential future issues: Privacy

The privacy issue is going to become a big issue in Big Data.  Although everything Big Data practitioners do is legal, the act of mining social media on a wholesale basis was never considered when social media sites were first introduced.  We are going to see some interesting and instructive debates on ethical issues over the next decade before we see a consensus emerge.  Any legislation passed before those ethical debates come to closure will prove to be ill-conceived and counterproductive.

Conclusion: Big Data and Mortgage Companies

Just to wrap up, I want to make it clear that Big Data is already having an impact on how the mortgage industry operates and we are still at the early stages. We are going to be in for a very interesting ride over the next few years.

If you want to learn more about this, feel free to get in touch with me directly.  I’m Eskinder Assefa, CEO of SOMAmetrics in Berkeley, California. We work with mortgage companies to help them realize their full business potential by improving their sales and marketing strategies and leveraging emerging technologies that have an impact on the bottom line.

Big Data is Getting Mortgage Companies the Right Info


The problem

In every industry other than the mortgage industry, buyers know exactly what they are buying before they lay their cash on the table. Car buyers can read Consumer Reports and drive the car around the block. Camera and computer buyers can pull up YouTube reviews of any product on the market in less than 30 seconds.

Mortgage originators do their best to collect all the information they can to determine whether a prospective mortgage will be paid as agreed. They have their standard checklists of questions and they are free to ask more questions as the application process goes on. But once the mortgage is put in place, the only way to see if the payments are made on time is to track actual payments. No one can tell the future. No one can tell if a mortgage holder is going to stop paying. Or at least that used to be the case.

Big Data is changing that picture. Big Data can help us look into the future with some degree of certainty. But before we get into how that works, let me give you a brief rundown on what Big Data is.

What is Big Data?

Historically, all the data computers used was set up in highly structured databases. In other words, we had separate fields for each piece of data and we spent a lot of time and effort to make sure all the data was clean and accurate. Big Data does away with that. Big Data reads data that was never meant to be analyzed by a computer. This includes everything from Tweets and Facebook postings to newspaper clippings. All of these were written for human consumption, not for computer processing. Big Data cuts through that. Big Data is able to read all of this unstructured, messy stuff that was never meant for computers and make sense of it. In other words, it can read Tweets and Facebook postings and data from hundreds of different sources that are written in incompatible styles and assign meaning to what it’s reading. In the mortgage industry, this means that we can now tap into huge reservoirs of information that were always available to us in the public domain, but we could never get a computer to work with.

Big Data is the Solution the Mortgage Industry Needs

Today, Big Data can tell mortgage companies whatever they want to know about the people who hold mortgages with them.  Big Data can operate as a kind of “distant early warning system” for account servicers.

1. Spending Analysis

Big Data can look at the shops where your mortgage applicants buy their clothes and watches. Then it can determine whether those shops are in line with their stated incomes or are splurges.  That’s not to say there is anything wrong with an occasional splurge, but if someone consistently spends beyond her earnings, then something is wrong.

2. Social Media Analysis

We all know the old adage that “birds of a feather flock together.” So, when you know who someone’s friends are, you know a lot about that person.  And where can you find out who someone’s friends are more easily than on Facebook? Big Data can collect a list of your applicants’ friends, build profiles, and assess applicants.  That assessment could accelerate the application approval or be instrumental in squashing it. Knowing the applicants’ friends can offer a second order benefit. If the company approves an applicant’s mortgage, then it can approach each of her friends as well.  This can be particularly lucrative for subprime mortgages.

3. Website Analysis

Even knowing the websites your applicants visit is fair game. Applicants who say they want to settle down and build a career but have recently spent a lot of time on overseas travel websites and airline websites are showing some sort of inconsistency. It’s better to discover that earlier rather than later.

4. Holistic Customer Account Analysis

Big Data can look at the actual spending patterns of mortgage applicants and see if they are in line with their stated income. If their spending is too high, they might prove to be good prospects for a subprime mortgage at higher interest rates.

Banks have historically operated in a highly siloed way. What I mean is that the department that handles checking and savings accounts knows nothing about their customers’ mortgage accounts, car loans, or children’s tax deferred education savings programs. Big Data can pull this data together across the bank’s own internal databases without violating any confidentiality agreements. This enables bank agents to make offers to their customers that are right on target. Imagine a customer who has been surfing new car websites for several weeks but has not asked for a loan – yet. When she stops into the bank on another matter, the teller could raise the question of a car loan, tell her the extent to which she has been preapproved, and direct her to the office that has already prepared the paperwork.
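Pulling siloed records together, as described above, amounts to grouping every department's records under a shared customer identifier. The sketch below assumes each department can export records as dictionaries with a `customer_id` key; that key, the department names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_customer_view(**silos):
    """Combine records from separate departments into one view per customer.

    Each keyword argument is a department name mapped to a list of record
    dicts, and every record is assumed to carry a 'customer_id' key.
    """
    view = defaultdict(dict)
    for silo_name, records in silos.items():
        for record in records:
            details = {k: v for k, v in record.items() if k != "customer_id"}
            view[record["customer_id"]].setdefault(silo_name, []).append(details)
    return dict(view)


# Hypothetical exports from three departments.
view = build_customer_view(
    checking=[{"customer_id": "C1", "balance": 4200},
              {"customer_id": "C2", "balance": 900}],
    mortgages=[{"customer_id": "C1", "principal": 310000}],
    auto_loans=[],
)

# Customers who hold a mortgage but no car loan: candidates for an offer.
prospects = [cid for cid, products in view.items()
             if "mortgages" in products and "auto_loans" not in products]
print(prospects)
```

With a combined view like this, the teller in the example would see the customer's whole relationship with the bank instead of a single account.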

So what’s the hold-up?

In spite of these advantages, only 38% of banks in 2013 were using Big Data that way, according to a survey Celent conducted that year. There is no doubt that percentage has increased during the last four years.

Some see the collection of this online data to be an invasion of privacy – and perhaps it is. The jury is still out. But as long as this information is in the public domain, it is hard to justify the argument that there is anything underhanded going on here. Nevertheless, customers who want to guard their data more carefully are free to limit access to their social media data to their “friends.” They can also instruct their browsers not to maintain histories or “cookies.” This carries a cost, of course. It’s often very handy for a computer user to rely on her browser to maintain user names and passwords to accelerate logins.

Full disclosure of web activity does not necessarily hurt customers, either. A bank could notify a user by email when someone is using her debit card to make a purchase that is out of character with her routine spending patterns. If there is no cause for alarm, she could simply ignore the alert. But if it is a threat, she could act immediately. By having a full picture of each customer’s browsing behavior as well as online and offline spending patterns, banks and other financial organizations can make offers that are genuinely appropriate and tailored to each customer.
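One simple way to decide whether a purchase is "out of character" is to compare it with the customer's past purchase amounts and flag anything far from the usual range. The sketch below uses a basic z-score test; the three-standard-deviation threshold and the sample amounts are hypothetical assumptions, and real fraud systems consider many more signals (merchant, location, time of day).

```python
from statistics import mean, stdev


def is_out_of_character(history, amount, threshold=3.0):
    """Return True when `amount` is unusually far from the customer's
    typical purchase size, measured in standard deviations."""
    if len(history) < 2:
        return False  # too little history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu  # every past purchase was identical
    return abs(amount - mu) / sigma > threshold


# Hypothetical recent debit card purchases, in dollars.
history = [42.0, 18.5, 63.2, 25.0, 51.3, 30.0]
print(is_out_of_character(history, 1250.0))
print(is_out_of_character(history, 45.0))
```

A purchase that trips the check would trigger the email alert described above, leaving the customer to decide whether it is a real threat.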

The Future of the Mortgage Industry and Big Data

In the future, we can expect the mortgage industry to use Big Data to access an ever-wider range of publicly available information to build an increasingly comprehensive profile of each customer. It will integrate arrest records, bankruptcy records, credit records, court judgments, property ownership, and library fines from publicly available online databases.

We can also expect companies in the business of buying existing mortgages to handle their own due diligence using Big Data. Each mortgage for sale may become more or less attractive over time depending on the recent behavior of its mortgage holder.

If you want to learn more about this, feel free to get in touch with me directly. I’m Eskinder Assefa, CEO of SOMAmetrics in Berkeley, California. We work with mortgage companies to help them realize their full business potential by improving their sales and marketing strategies and leveraging emerging technologies that have an impact on the bottom line.

Six Ways Big Data Impacts the Mortgage Industry


There are six ways Big Data impacts the mortgage industry. And what I’m going to tell you about now is just the leading edge of a transformation that is going to make all the difference between the winners and the losers in this industry.

But before I get into that, let me give you a very brief explanation of what Big Data means for readers who don’t have any experience with it, and how Big Data impacts industries today.

Historically, all the data computers used was set up in highly structured databases. In other words, we had a separate field for each piece of data and we spent a lot of time and effort to make sure all the data was clean and accurate.  Big Data does away with that. Big Data reads data that was never meant to be analyzed by a computer.  This includes everything from Tweets and Facebook postings to newspaper clippings. All of these were written for human consumption, not for computer processing.

Big Data cuts through that.  Big Data is able to read all of this unstructured, messy stuff that was never meant for computers and make sense of it.  In other words, it can read Tweets and Facebook postings and data from hundreds of different sources that are written in incompatible styles and assign meaning to what it’s reading.  In the mortgage industry, this means that we can now tap into huge reservoirs of information that were always available to us in the public domain, but that we could never get a computer to work with.

Big Data opens up new and exciting opportunities in the mortgage industry. I’m going to tell you very briefly about six of them.

1. Account Origination and Underwriting

The first is in the area of account origination and underwriting.  In the past, we only looked at the information applicants gave us on their mortgage application forms.  Of course, we would verify the information whenever we could by contacting third parties. But the fact is that we didn’t – and couldn’t – go beyond that.  Some pundits have estimated that this data really only represents about 5% of what we should take into account when deciding to approve a mortgage.

There are many millennials who don’t use banks as much as their parents did. This means they don’t leave the banking breadcrumbs that the folks who make decisions about mortgage approvals like to follow. But, in a larger sense, these millennials do qualify for mortgages because they have the potential to meet their monthly mortgage payments.

Big Data impacts the mortgage industry by giving us the tools we need to mine data sources that were unavailable before.  I’m talking about social media data and actual financial purchasing patterns. When we learn to factor in the data about mortgage applicants from a far wider range of sources, we’ll be able to make better informed decisions about approving mortgages.

2. Account Servicing

Account Servicing is a low margin business with lots of transactions. We never know when things go wrong with an account until they do go wrong.  We don’t have any way to know that an account is in distress until the payments start coming in late – or don’t come in at all. On the other hand, we don’t know that a couple is ready to move up to a larger house and a larger mortgage until they file a new mortgage application.

Big Data can help us be proactive in these situations by monitoring publicly available data and drawing conclusions about the problems and opportunities that are coming up soon.

Big Data impacts the way we perform account servicing. Big Data allows us to pull together data from different sources in different formats to predict when trouble is in the offing. We can track household spending patterns from our clients’ credit card records. We can see drops in their incomes by tracking their bank account deposits. We can find out when they lose their jobs by tracking their Facebook and LinkedIn accounts.

By the same token, we can use these same sources to determine when a baby is on the way and a couple is likely to take a step up in housing. Or maybe a client announces his promotion or new job with a new employer on his social media.  Maybe our Big Data engines track an announcement of a promotion in the business section of the local newspaper. These are indicators that your client will soon come knocking on your door asking for a larger mortgage.  Or maybe you should knock on her door to let her know that you’ve already qualified her for a larger mortgage.

3. Cross Selling

Cross Selling is a third area of opportunity that Big Data can facilitate. People outside the banking industry are shocked to learn that their banks operate as a collection of silos that happen to share a piece of real estate.  Here’s what I mean. The department that handles checking and savings accounts has no connection with the mortgage department. The department that handles lines of personal credit is divorced from the other departments. And it goes on and on like this throughout the bank.  A bank that can build a single view of its customer – what bankers call an SVC – can cross sell financial products that include a mortgage, a car loan, and a child’s education fund.

But let’s say that you work for a mortgage company that only offers mortgages. In a situation like this, you could identify cross selling opportunities and refer them to partners in other companies. When your partner closes a piece of business, you can pick up a finder’s fee or a referral commission.  This wouldn’t be possible without Big Data.

4. Risk and Regulatory Reporting

Risk and regulatory reporting is a fourth fertile area. After the crisis in 2008, the federal government put several programs in place to protect mortgage borrowers. These include the Home Affordable Refinance Program, or HARP, and the Home Affordable Modification Program, or HAMP. Others are the Short Refinance Program and HAPA. Mortgage companies need to make sure they don’t run afoul of these programs. But here’s the catch: when mortgage companies are managing their day-to-day business, they’re focusing on only one mortgage application at a time. No one is looking at the makeup of the full portfolio of applications at the detail level. And that’s exactly why mortgage companies are shocked to learn that they are violating program guidelines. This is where Big Data can come to the rescue. Big Data can look at your full mortgage portfolio and test it against the full range of the compliance terms in each of these programs. Big Data can highlight upcoming problems before regulators do. Mortgage companies that proactively identify these compliance issues are going to have a far better year-end than those who find they are out of compliance when they are facing a regulator in court.
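
Here is a rough sketch, in Python, of what testing a whole portfolio against a set of compliance terms could look like. Please treat every detail as an assumption made for illustration: the `Loan` fields, the rule names, and the 1.25 loan-to-value cap are invented, and they are not the actual terms of HARP, HAMP, or any other program.

```python
from dataclasses import dataclass

@dataclass
class Loan:
    loan_id: str
    loan_to_value: float    # outstanding balance divided by property value
    payments_current: bool  # True if the borrower is not behind

# Hypothetical rules for illustration only. These are NOT the real
# terms of any government program.
RULES = {
    "ltv_above_cap": lambda loan: loan.loan_to_value > 1.25,
    "payments_behind": lambda loan: not loan.payments_current,
}

def audit_portfolio(loans):
    """Check every loan against every rule and return
    {rule_name: [ids of the loans that break it]}."""
    findings = {name: [] for name in RULES}
    for loan in loans:
        for name, violates in RULES.items():
            if violates(loan):
                findings[name].append(loan.loan_id)
    return findings

portfolio = [
    Loan("A-1", 0.80, True),
    Loan("A-2", 1.40, True),
    Loan("A-3", 0.90, False),
]
print(audit_portfolio(portfolio))
```

The point of the design is that the rules live in one table and are run against every loan at once, so the question "how does the full portfolio stand against each program?" gets asked routinely instead of one application at a time.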

5. Mortgage Fraud

Mortgage Fraud is a fifth opportunity area. Here I’m talking about subprime fraud, property valuation fraud, and foreclosure fraud. Fraud is rare because, by and large, people are honest and they act in good faith. But, every once in a while, you’re going to come across a fraudulent transaction – and that can be expensive. You are far better off if you can catch the fraud early on and deal with it immediately.

Again, this is where Big Data impacts the mortgage industry. Big Data can look for patterns and discontinuities in those patterns. The FBI and other law enforcement agencies are always developing sophisticated techniques to identify potential fraud. We can harness those techniques in Big Data algorithms to analyze the mortgage applications under consideration.  The Big Data algorithms will run in the background and let your staff get on with the business of approving your mortgage applications.

6. Corporate Acquisitions

The last area I want to talk about is corporate acquisitions. Think back to the period right after the 2008 collapse.  At that time, we saw Bank of America buy Countrywide Financial and then go on to lose $40 billion from that little acquisition.  The same sort of thing happened with Washington Mutual. The company failed and the government took it into receivership. JP Morgan bought Washington Mutual and took it out of receivership.  Within a couple years it became clear this was not a very good business decision at all.  In fact, JP Morgan eventually sued the Federal Deposit Insurance Corporation over that purchase for non-disclosure.

If these companies had access to Big Data tools and used them wisely, they could have avoided these debacles.  Big Data could have sifted through the huge masses of data available in both structured and unstructured forms and uncovered these problems well in advance.  It’s not as though anybody was trying to hide anything.  All the information was there for anyone to see.  The real problem was that there was so much information in so many different forms prepared for so many different purposes that it was virtually impossible for anyone – or even a team – to sift through the material and highlight the problems that would be evident if you just knew where to look.

In Summary: Big Data Impacts

In summary, I think it’s clear that we already have some compelling evidence that Big Data impacts the mortgage industry significantly when it has been applied wisely.  These initial successes are going to spur others on to use this rapidly developing technology more widely in their own companies.

If you want to learn more about this, I urge you to read what Vamsi Chemitiganti from Hortonworks has written on the subject.  Or feel free to get in touch with me directly.  I’m Eskinder Assefa, CEO of SOMAmetrics in Berkeley, California. We work with mortgage companies to help them realize their full business potential by improving their sales and marketing strategies and leveraging emerging technologies that have an impact on the bottom line.

Big Data Case Studies in Education

Big Data Case Studies with Proven Results

Big Data Case Studies: Coursera

Coursera provides education from leading universities around the world delivered over the internet. The instruction is handled through data streaming videos. Coursera tracks how its students watch those courses. Students might “rewind” to watch a section a second time. Or they might fast forward – skipping stuff they think they already know. Or they might go over the same course several times. Or they might just quit and walk away. Whatever they do, Coursera tracks it on a student-by-student basis. The company learns from this experience. It learns what works and what doesn’t. Occasionally it throws in a pop-quiz to see how well the students are learning. But there’s another reason, too. The company wants to see how well it’s doing. It’s a kind of self-evaluation. When the course designers realize that the learning process is not going as they had expected, they can go back and rework their material based on real-world feedback.

Big Data Case Studies: Arizona State University

Arizona State University, like many universities across the country, has its fair share of freshmen who are genuinely challenged in mathematics. One third of its freshmen earned less than a C in math. Interestingly, this one score has been a reliable indicator of whether students would eventually graduate and collect their degrees – or drop out. To deal with this, ASU worked with Knewton to apply its adaptive learning techniques. In just two years, from 2009 to 2011, the pass rate in the math course jumped from 64% to 75%, while the dropout rate fell by 50%.

Big Data Case Studies: West Virginia University

Simon Diaz, a professor at West Virginia University, was very curious why so many students who enrolled in online classes dropped out. One of the key rationales for providing online classes with streaming video at times convenient for the students was that the students wouldn’t feel shackled to a schedule that was incompatible with the daily realities of their lives. Using Big Data analytics, he looked at 33 variables for more than one million students. These variables ranged from things you would expect, like age and gender, to things you wouldn’t expect, like military service and class size. What he discovered had never been obvious to anyone else before. The more classes students took at any one time, the more likely they were to drop out. Simply reducing the number of courses students enrolled in at any one time would increase retention rates. But financial grants to students require those students to take a minimum number of courses. In other words, public policy was at odds with good educational practice – a conundrum that no one had discovered before, rooted in a policy that had probably never been tested against any empirical evidence. Another win for Big Data in Education.

Big Data Case Studies: Kent State

Kent State uses analytics to track student activity and project the likelihood of success. It tracks students over a ten-year period, collecting data about their majors, classes, demographics and other factors. Its system highlights the students at risk with red, yellow, and green indicators. The reports help advisors focus their efforts on problem areas. Steven Antalvari, Kent State’s director of academic engagement and degree completion, said, “This data has helped us peel away certain layers faster, allowing us to spend the bulk of our time together working on the student’s purpose, goals, and career development.”
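
A red, yellow, and green indicator of this kind can be sketched in a few lines of Python. This is not Kent State's model, which I have not seen. The inputs (grade point average and credit completion) and every cutoff below are assumptions chosen only to show the idea.

```python
def risk_indicator(gpa, credits_attempted, credits_earned):
    """Map a student's record to a traffic-light flag.
    All thresholds are hypothetical, chosen for illustration."""
    if credits_attempted:
        completion = credits_earned / credits_attempted
    else:
        completion = 0.0  # no attempted credits yet: treat as at risk
    if gpa < 2.0 or completion < 0.5:
        return "red"
    if gpa < 2.5 or completion < 0.75:
        return "yellow"
    return "green"

print(risk_indicator(3.4, 30, 30))
print(risk_indicator(2.3, 30, 28))
print(risk_indicator(1.8, 30, 30))
```

An advisor's report would then simply sort students by flag, so the red cases get attention first.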

The Top Players in the Education Technology Industry

Here are some of the well-recognized players in the Education Technology (EdTech) sector – in no particular order. These are the companies that are developing the paradigms that will shape Big Data in Education. They are also the companies that are developing the technologies to implement those paradigms and offer them to educational institutions.

This is important because, until recently, schools have not needed to look outside their own walls for the tools they needed to do their work. The obvious exceptions are textbooks and, starting some 60 years ago, general purpose computers.

Educase Conference – This company is growing fast. It offers systems to store Big Data in a cloud and perform analytics on that data to make sense of it for administrators.

SAS – SAS is a well-established company that dominates the advanced analytics industry with almost 32% of the market.

Renaissance Learning – A few years ago this company was sold for $1.1 billion. Renaissance is a testing and student data company. At the time of its sale, it had data on the test results for 10.1 million school age children.

InBloom – This company is a middleman between school districts and education technology companies. It handles the data storage and distribution of student data to authorized users. The Bill and Melinda Gates Foundation and the Carnegie Corporation of New York had so much confidence in this venture that they kicked in $100 million. However, social concerns about data security grew to a fever pitch and the company withdrew its offerings.

Coursera – Coursera is a start-up company that offers courses over the Internet. It offers accredited courses from the University of Illinois, the University of Pennsylvania, Johns Hopkins University, the University of Michigan, Stanford University, UC San Diego, Duke University and 150 other universities around the globe. Students can even earn a master’s degree in business, accounting, data science, and entrepreneurship through Coursera.

Noodle – Noodle is based in New York. The company offers fact-based information to help prospective students choose an elementary school, a graduate school, a summer camp, or even a tutor.

Knewton – According to its website, Knewton recognizes the value of adaptive instruction and education technology: “Individual students bring different skills and different challenges into the same classroom. Knewton’s pioneering approach to adaptive learning draws on each student’s own history, how other students like them learn, and decades of research into how people learn to improve future learning experiences.”

Big Data in Education: Full of Promise, Uncertain Future

This is What Big Data in Education Looks Like

One educational practitioner used Big Data to catch an anomaly in a course that was designed to progress smoothly from one module to the next. He found that the students in the class progressed from module 1 to module 7 as expected. At that point, however, most of them went back and replayed module 3. It became very clear that the material in module 3 hadn’t “stuck.” This led the course developers to revisit that module and upgrade it. They did this even though none of the students complained about that module. By monitoring what students actually did on a massive scale, the company saw an opportunity to upgrade its course and did it.
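
For readers who like to see the mechanics, here is a small Python sketch of how a replay pattern like this could be spotted in viewing logs. The log format (a list of module numbers per student, in the order watched) and the student IDs are assumptions for the example, not the practitioner's actual data.

```python
from collections import Counter

def revisited_modules(view_logs):
    """For each module, count how many students jumped back to it
    after they had already reached a later module."""
    revisits = Counter()
    for modules_watched in view_logs.values():
        furthest = 0
        jumped_back_to = set()
        for module in modules_watched:
            if module < furthest:
                jumped_back_to.add(module)  # count each student once per module
            furthest = max(furthest, module)
        revisits.update(jumped_back_to)
    return revisits

# Hypothetical logs: the order in which each student watched the modules
logs = {
    "s1": [1, 2, 3, 4, 5, 6, 7, 3, 8],
    "s2": [1, 2, 3, 4, 5, 6, 7, 3, 8],
    "s3": [1, 2, 3, 4, 5, 6, 7, 8],
    "s4": [1, 2, 3, 4, 5, 4, 6, 7, 3, 8],
}
print(revisited_modules(logs).most_common(1))  # [(3, 3)]
```

With these sample logs, module 3 stands out because three of the four students went back to it, which is exactly the kind of signal that would send the course developers back to that module.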

In another instance, students were stumbling on a particular question and were notified immediately that they missed it. Many of those students read the related forum material, reworked the quiz, and got the answer right. When the course instructor discovered this through Big Data analysis, he inserted a recommendation in the course for students who got the answer to that question wrong: He referred them to the forum post that had proven useful for everyone else.

Spanish-speaking students studying English via Duolingo would stumble and fall when learning the English pronouns “he,” “she,” and “it.” This led to high dropout rates. Why? Well, Spanish doesn’t have an equivalent to “it.” This was a new concept – and a new way of thinking – for Spanish-speaking students. The solution was simple. The course postponed the introduction of the word “it” for a few weeks and student retention soared.

New York City has a program called School of One. In this school, each student gets his own playlist of modules to study. The students need to learn math. They go at their own speed. If one module doesn’t do the trick for them, they try another. Now, the real question is, “Does it work?” Well, independent studies by a private educational service reported that students who went through this program did substantially better than those who did not. Yes, it works.

What is Big Data in Education?

There are two major areas of interest in the field of Big Data in Education: institutional and educational.

Institutions collect masses of data from traditional sources as well as new sources to develop their policies and plans. The new sources include Facebook posts and Twitter tweets to get a sense of the sentiments among current students, prospective students, and the community at large. The institution can also pick up macroeconomic and microeconomic data that are useful but were prohibitively expensive to include before.

Educational or instructive purposes are intended to personalize the learning process for each individual student. Here, schools at all levels can collect detailed data about students’ progress through their learning journey on a moment by moment basis. The idea is that the system can identify when a student is caught in a vortex that prevents her from making progress. At that point, the system could notify the teacher about the problem a student is having at the moment it occurs. On the other hand, the system could be designed to introduce a tutorial that deals with the problem area as it occurs, not weeks later when a failing score highlights the students’ learning problems.

This student-oriented, real-time instructional intervention has several benefits. First, it helps the students well before frustration, disillusionment, and failure set in. This helps the students to become proficient in the subject material – even master it. It also has the benefit of assisting the teacher to focus her attention on just the sort of help that is needed. This is particularly beneficial in large classrooms. Students’ success in the classroom will lead them to stay in school and progress toward graduation. Success in school is correlated with success in the workplace.

In addition, as students stay in school and graduate, the school builds its reputation as a place where students can come to succeed. This attracts new students. Further, by keeping students in school until graduation, the school improves its revenues as well as its reputation. These retention and graduation rates loom large in school evaluations.

The US is Not a Leader

Ironically, the US is not a leader in education when compared with other developed countries in the world. We have seen class sizes in the public schools grow to the point that teachers can no longer provide individual attention. Funding to schools at all levels is constantly being cut back. There is no question that America is home to some of the leading universities in the world, but those universities are not characteristic of the country as a whole.

One way of dealing with this growing gap between the US and other countries in the quality of education and access to it is to adopt distance education and Big Data technologies. These technologies promise to offer education aligned with the students’ schedules, not classroom schedules. Further, they promise to offer meaningful tutorials on problem areas tailored to each student as and where they are needed. These benefits are likely to be compelling in informing educational policy.

AltSchool May Be the Extreme

If you want to know what Big Data in Education on steroids looks like, look at AltSchool. This San Francisco Bay Area company will record everything about its students while they are in school. That means EVERYTHING. It will track how they go through their learning experiences – heart rate, eye movement, facial expressions, movement from one part of a computer screen to another, how long their mouse pointers hover over items on their screens. It will record every word. Almost every thought. Everything. All of this data is then fed into a Big Data database. Top notch data scientists will comb through this data to learn in detail how to personalize the learning experience for each child.

AltSchool might find that some students improve their mathematics studies after exercising in the schoolyard. Or a student starts incorporating new words in her vocabulary after watching a particular video. Then the school will incorporate those insights into the student’s daily routine and see whether the benefits persist over time. In fact, the school planners would use Big Data to look for an ongoing series of tweaks they could make. That will provide a stream of changes that may (or may not) provide enduring value. This is personalization at the extreme.

Some argue that this sort of super tight oversight smacks of Big Brother – and maybe it does. But if it pays off in terms of enhanced results for the students, then it may be worthwhile. There are probably hundreds of practices we respect in everyday life today that may have seemed strange – even objectionable – to our forefathers a few generations ago. For example, it was less than three generations ago that it was common practice to whip children who performed poorly in school; today that practice would be unheard of.

Huge Investments

GSV Advisors estimated that the e-learning market in the US is over $100 billion. Further, it’s growing at 25% a year. Well-established companies like McGraw-Hill, News Corp., Pearson, and Kaplan have spent billions to get into this market. Further, there are a lot of start-up companies mushrooming in this space as well. We’ve listed just a handful of the notables below, but there are many other worthy companies that didn’t make this list.