With Delta-p statistics, the predictions based on a logistic regression model are easy for non-technical decision-makers to understand.
Learn how to calculate the Delta-p statistics based on the coefficients of a logistic regression model for credit application processing.
The data workflow covers the steps from accessing the raw data to training the logistic regression model and evaluating the effects of individual predictor columns with Delta-p statistics.
Keep in mind that logistic regression might not be the best choice when working with high-dimensional data with many correlated predictor columns.
Imagine a situation where a customer applies for credit. The bank collects data about the customer (demographics, existing funds, and so on) and predicts the customer’s creditworthiness with a machine learning model. The customer’s credit application is rejected, but the banker doesn’t know exactly why. Or a bank wants to advertise its credit products, and the target group should be those who can eventually get credit. But who are they?
In these kinds of situations, we would prefer a model that is easy to interpret, such as the logistic regression model. The Delta-p statistics make the interpretation of the coefficients even easier. With Delta-p statistics at hand, the banker doesn’t need a data scientist to be able to inform the customer, for example, that the credit application was rejected because all applicants who apply for credit for education purposes have a very low chance of getting credit. The decision is justified, the customer is not personally hurt, and he or she might come back in a few years to apply for a mortgage.
In this article, we’ll explain how to calculate the Delta-p statistics based on the coefficients of a logistic regression model. We demonstrate the process from raw data to model training and model evaluation with a KNIME workflow where each intermediate step has a visual representation. However, the process could be implemented in any tool.
Assessing the Effect of a Single Predictor with the Delta-p Statistics
Logistic Regression Model
When we use the logistic regression algorithm for classification, we model the probability of the target class, for example, the probability of a bad credit rating, with a logistic function. Let’s say we have a binomial logistic regression model with a target column y, credit rating, with two classes that are represented by 0 (good credit rating) and 1 (bad credit rating). The log odds of the target class (y=1) vs. the reference class (y=0) is a linear combination βx of the predictor columns x (account balance, credit duration, credit purpose, etc.). A logistic function of βx transforms the log odds into a probability of the target class:

$$P(y=1 \mid x) = \frac{1}{1 + e^{-\beta x}}$$

where β is the vector of coefficients for the predictor columns x in the logistic regression model that predicts the target class y.
The target and reference classes can be arbitrarily chosen. In our case, the target class is “bad credit rating,” and the reference class is “good credit rating.”
If the single predictor column xᵢ is continuous, the coefficient βᵢ corresponds to the change in the log odds of the target class when xᵢ increases by 1. If xᵢ is a binomial column, the coefficient value βᵢ is the change in the log odds when xᵢ changes from 0 to 1. The change in the probability of the target class is provided by the logistic function, as shown in Figure 1.
Figure 1. Logistic function modeling the probability of the target class y=1 as a function of one continuous predictor column xᵢ
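To see why a coefficient alone does not translate into one fixed change in probability, the following minimal Python sketch applies the same coefficient at different starting probabilities. The function names and the three baseline values are illustrative assumptions, not part of the original workflow; the coefficient 0.683 is the one reported for purpose=education in the use case.

```python
import math

def logistic(log_odds: float) -> float:
    """Map log odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def probability_change(baseline: float, beta: float) -> float:
    """Change in probability when the log odds at `baseline` shift by `beta`."""
    baseline_log_odds = math.log(baseline / (1.0 - baseline))
    return logistic(baseline_log_odds + beta) - baseline

# Hypothetical baselines: the same coefficient moves each one by a different amount.
for baseline in (0.05, 0.30, 0.50):
    print(f"{baseline:.2f} -> {probability_change(baseline, 0.683):+.3f}")
```

The shift is smallest for the baseline close to 0, which is why a statement about the probability effect of a predictor needs a fixed reference point such as the average data point.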
The Delta-p statistics transform the coefficient values βᵢ into percentage effects of single predictor columns on the probability of the target class, compared to an average data point, e.g., an average credit applicant.
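By definition, the Delta-p statistic is a measure of the discrete change in the estimated probability of the occurrence of an outcome given a one-unit change in the independent variable of interest, with all other variables held constant at their mean values. For example, if the Delta-p value of a predictor column xᵢ is 0.2, then a unit increase in this column (or a change from 0 to 1 in a binomial column) increases the probability of the target class by 20 percentage points. The following formulas show how to calculate the prior and post probabilities of the target class and, finally, the Delta-p statistics as their difference:

$$P_{\text{prior}} = P(y=1), \qquad L_0 = \ln\frac{P_{\text{prior}}}{1 - P_{\text{prior}}}$$

$$P_{\text{post}} = \frac{1}{1 + e^{-(L_0 + \beta_i)}}$$

$$\Delta p = P_{\text{post}} - P_{\text{prior}}$$

Here the prior probability is the probability of the target class for an average data point (in practice, the share of the target class in the data), and L₀ is the corresponding log odds.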
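The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' KNIME implementation: the function name `delta_p` is an assumption, and the prior is taken to be the share of the target class in the data.

```python
import math

def delta_p(prior: float, beta: float) -> float:
    """Delta-p: post probability minus prior probability for a one-unit change."""
    prior_log_odds = math.log(prior / (1.0 - prior))   # log odds of the average data point
    post_log_odds = prior_log_odds + beta              # shift by the coefficient
    post = 1.0 / (1.0 + math.exp(-post_log_odds))      # back to a probability
    return post - prior

# Prior 0.30 and coefficient 0.683 are the values reported in the use case.
print(round(delta_p(0.30, 0.683), 3))  # 0.159
```

Because only the prior and one coefficient are needed, the same function can be applied to the coefficients of a model that was trained earlier.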
Use Case: The Effect of Credit Purpose and Current Account Balance on Credit Rating
Let’s now demonstrate this with an example and check how the credit purpose and the balance of an existing account improve or worsen the credit rating. We use the German credit data provided by the UCI Machine Learning Repository. The dataset contains 21 columns that provide information about demographics and economic conditions of 1,000 credit applicants. Thirty percent of the applicants have a bad credit rating, and 70 % have a good rating. You can download the data in .data format by clicking “Data Folder” on top of the page, and selecting the “german.data” item on the next page. The german.data file can be opened in a text editor and saved, for example, in csv format. The column names and descriptions of the values in the categorical columns are provided in the german.doc file, accessible via the same “Data Folder” page.
The workflow in Figure 2 shows the process from accessing the raw data to training the logistic regression model and evaluating the effects of individual predictor columns with Delta-p statistics. The process is divided into the following steps, each one implemented within a separate colored box: accessing data (1), preprocessing data as required by a logistic regression model (2), training the model (3), and calculating the Delta-p statistics based on the model coefficients (4). In the preprocessing step, we convert the target column from the 1/2 notation to “good”/“bad.” We also transform two originally multinomial columns into binomial columns. We encode the “checking” column into two values, “negative”/“some funds or no account,” based on the status of the existing bank account. We encode the “purpose” column into the values “education”/“no education” to assess the effect of education as a credit purpose. Finally, we handle missing values and normalize the numeric columns in the data.
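For readers working outside KNIME, the recoding part of the preprocessing step could look roughly like the Python sketch below. It is not the authors' workflow: the function names and the sample record are hypothetical, and the category codes A11 (negative balance) and A46 (education) are assumptions that should be checked against german.doc. Missing value handling and normalization are not shown.

```python
def encode_target(code: str) -> str:
    """german.data stores the rating as 1 (good) or 2 (bad)."""
    return {"1": "good", "2": "bad"}[code]

def encode_checking(code: str) -> str:
    """Collapse the checking-account status into two values.

    Assumption: A11 is the code for a negative balance (check german.doc).
    """
    return "negative" if code == "A11" else "some funds or no account"

def encode_purpose(code: str) -> str:
    """Collapse the credit purpose into education vs. everything else.

    Assumption: A46 is the code for education (check german.doc).
    """
    return "education" if code == "A46" else "no education"

# Hypothetical record with only the three columns of interest.
record = {"checking": "A11", "purpose": "A43", "rating": "2"}
encoded = {
    "checking": encode_checking(record["checking"]),
    "purpose": encode_purpose(record["purpose"]),
    "rating": encode_target(record["rating"]),
}
print(encoded)
```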
Figure 2. The process from accessing raw credit customer data to training a credit rating model, and to evaluating the effects of predictor columns to the credit rating with Delta-p statistics. This solution was built in KNIME Analytics Platform, and the Assessing Effects of Single Predictors with Delta-p workflow can be inspected and downloaded on the KNIME Hub.
Figure 3 shows the coefficient statistics of the logistic regression model, reproducible in any tool. The “Coeff.” column shows the coefficient values for the different predictor columns, for example, 0.683 for purpose=education. The “P>|z|” column shows the p-values of the coefficients, 0.055 for purpose=education. This means that education as a credit purpose increases the probability of a bad credit rating, since the coefficient value is positive, and this effect is significant at the 10 % significance level (90 % confidence level), since the p-value is smaller than 0.1.
Figure 3. Coefficient statistics of a logistic regression model that predicts the credit rating good/bad of a credit applicant
By looking at the coefficient statistics of the logistic regression model, we find out that education as a credit purpose increases the probability of a bad credit rating compared to other credit purposes. In addition, the coefficient value 0.683 tells us that the log odds ratio for getting a bad credit rating with/without education as the credit purpose is 0.683, and the odds ratio of the two groups is e^0.683 = 1.979. What would this mean, for example, in a group of 100 credit applicants, let’s say 20 of them with education as the purpose (group 1) and the remaining 80 with another purpose (group 2)? If 10 out of the 80 applicants in group 2 have a bad credit rating, their odds of a bad rating are 10/70 ≈ 0.143. According to the odds ratio 1.979, the odds for group 1 must be ~2 times the odds of group 2, so about 0.283 in this case. Odds of 0.283 correspond to a probability of 0.283/1.283 ≈ 0.22, so roughly 4 of the 20 applicants in group 1 would be expected to have a bad credit rating.
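The odds arithmetic is easy to get wrong, so here is a short Python sketch that retraces it. The group sizes are the hypothetical ones from the example; note that odds are computed as bad ratings over good ratings (10/70 for group 2), not bad ratings over all applicants.

```python
import math

beta_education = 0.683                          # coefficient reported for purpose=education
odds_ratio = math.exp(beta_education)           # odds ratio between the two groups

# Hypothetical group 2: 10 bad and 70 good ratings among 80 applicants.
odds_group2 = 10 / 70                           # odds = bad / good
odds_group1 = odds_ratio * odds_group2          # implied odds for group 1
prob_group1 = odds_group1 / (1 + odds_group1)   # odds back to probability
expected_bad_group1 = 20 * prob_group1          # expected bad ratings among 20 applicants

print(f"odds ratio {odds_ratio:.2f}, group 1 probability {prob_group1:.2f}")
```

The conversion from odds back to probability is the step that makes odds ratios hard to communicate, since the resulting probability depends on the odds of the comparison group.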
The coefficient statistics have a universal scale, and we can use them to compare the magnitude and the effect of different predictor columns. However, to understand the effect of a single predictor, the Delta-p statistics provide an easier way! Let’s take a look:
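In Figure 4 you can see the Delta-p statistics and the intermediate results in calculating it. For the purpose=education variable, with the 30 % share of bad credit ratings as the prior probability, the calculation is:

$$L_0 = \ln\frac{0.3}{1 - 0.3} \approx -0.847$$

$$P_{\text{post}} = \frac{1}{1 + e^{-(-0.847 + 0.683)}} \approx 0.459$$

$$\Delta p = 0.459 - 0.3 = 0.159$$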
Figure 4. Delta-p statistics, its intermediate results, and the corresponding coefficient statistics of a logistic regression model that predicts the credit rating good/bad of a credit applicant
The value 0.159 of the Delta-p statistics indicates that education as a credit purpose increases the probability of a bad credit rating by 15.9 percentage points compared to an average credit applicant.
If we wanted to compare the effect to the opposite situation, i.e., the credit purpose is not education, instead of an average credit applicant, we would need to recalculate the prior probability and also mean-center the binomial values of the predictor column of interest xᵢ. In our data, 5 % of the people apply for credit for education purposes, so the mean of the “purpose” column xᵢ is 0.05.
The value 0.158 of the Delta-p statistics indicates that applying for credit for education purposes increases the probability of a bad credit rating by 15.8 percentage points compared to those who apply for other purposes. There’s hardly any difference from the previous situation where we compared against an average applicant and obtained the Delta-p value 0.159 (Figure 4). This means that the credit applicants with other purposes than education are very close to the sample average in terms of their credit rating, apparently because they make up 95 % of the total sample.
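One way to read the mean-centering step is sketched below in Python. This is a reconstruction under stated assumptions, not necessarily the authors' exact formula: the average applicant is assumed to have a prior probability of 0.30, and the binomial predictor is shifted by its mean (0.05) before being multiplied by the coefficient. Under these assumptions the sketch reproduces the reported value of 0.158.

```python
import math

def logistic(log_odds: float) -> float:
    """Map log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def delta_p_vs_reference(prior: float, beta: float, mean_x: float) -> float:
    """Delta-p of x=1 against x=0 for a binomial predictor with mean `mean_x`."""
    avg_log_odds = math.log(prior / (1.0 - prior))                # average applicant
    p_without = logistic(avg_log_odds + beta * (0.0 - mean_x))    # x = 0, mean-centered
    p_with = logistic(avg_log_odds + beta * (1.0 - mean_x))       # x = 1, mean-centered
    return p_with - p_without

print(round(delta_p_vs_reference(0.30, 0.683, 0.05), 3))  # 0.158
```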
Now we know that applying for credit for education purposes has a negative effect on the credit rating. Which column could have a positive effect? Let’s check the effect of the other dummy column that we created, the “checking” column that tells if the balance of the existing account is negative. The coefficient value of checking=some funds or no account is -1.063 with a p-value of 0, as you can see in the first row in Figure 3.
As the Delta-p statistic -0.171 in the first row in Figure 4 shows, credit applicants without a negative account balance tend to have a 17.1 percentage point lower probability of a bad credit rating than an average credit applicant. Interestingly, we found two columns, purpose and checking, that have an effect of almost the same size but in opposite directions. If we look at the odds ratios of these two variables in Figure 4, we wouldn’t get the same information at first glance: The odds ratio is 0.345 for checking=some funds or no account and 1.979 for purpose=education.
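The contrast between the two scales can be reproduced with a small Python sketch that takes the two reported coefficients and prints both statistics side by side. The dictionary layout and names are illustrative assumptions, and the prior of 0.30 is the share of bad ratings in the data.

```python
import math

def delta_p(prior: float, beta: float) -> float:
    """Post probability minus prior probability for a one-unit change."""
    prior_log_odds = math.log(prior / (1.0 - prior))
    post = 1.0 / (1.0 + math.exp(-(prior_log_odds + beta)))
    return post - prior

PRIOR = 0.30  # share of bad credit ratings in the data
coefficients = {
    "purpose=education": 0.683,
    "checking=some funds or no account": -1.063,
}

summary = {
    name: {"odds_ratio": math.exp(beta), "delta_p": delta_p(PRIOR, beta)}
    for name, beta in coefficients.items()
}
for name, stats in summary.items():
    print(f"{name}: odds ratio {stats['odds_ratio']:.3f}, Delta-p {stats['delta_p']:+.3f}")
```

Because the coefficients used here are rounded to three decimals, the last digit of a printed odds ratio may differ slightly from the value in Figure 4.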
In this article, we have introduced Delta-p statistics as a straightforward way of interpreting the coefficients of a logistic regression model. With Delta-p statistics, the predictions based on a logistic regression model are easy for non-technical decision-makers to understand.
In this article, we used Delta-p statistics to assess the individual effects that make a credit application succeed or fail. Of course, Delta-p statistics have many more use cases. For example, we could use them to determine the individual touchpoints that decrease or increase customer satisfaction the most, or to find the symptoms with the highest relevance when detecting a disease. Also notice that the whole process from raw data to model training and model evaluation doesn’t always need to be completed: Delta-p statistics can also be used to re-evaluate the coefficients of a previously trained logistic regression model.
Delta-p statistics can only be used to assess the individual effects of predictor columns in a logistic regression model. Logistic regression might not be the best choice when working with high-dimensional data that contains many correlated predictor columns or columns not correlated with the target column. The model also assumes that the log odds of the target class depend linearly on the predictors, so it fits best when the target classes can be separated reasonably well by a linear boundary in the feature space.
If you want to replicate the procedure described in the article, one option is to install the open source KNIME Analytics Platform on your laptop and download the KNIME workflow attached to the article for free. A visual representation of the workflow is available on the KNIME Hub without installing KNIME Analytics Platform. Other options are to implement the calculations in any other programming tool, or even perform them manually with a calculator.
About the Authors
Maarit Widmann is a data scientist at KNIME. She started with quantitative sociology and holds a bachelor’s degree in social sciences. The University of Konstanz made her drop the “social” part when she completed her Master of Science! She now communicates concepts behind data science in videos and blog articles. Follow Maarit on LinkedIn.
Alfredo Roccato is an independent consultant and trainer with a focus on data science. He studied statistics at the Catholic University in Milan and has been serving companies with business intelligence and analytics for over 35 years. Follow Alfredo on LinkedIn.
After the current U.S. Congress was sworn in, a predictable chorus of merchants, lobbyists, and lawmakers demanded new interchange price caps and other government mandates to decrease credit card interchange fees for merchants. The tired attacks on credit cards are an easy narrative that focuses almost exclusively on the cost side of the ledger, while completely ignoring the cards’ important role in the economy and the regressive effects of interchange regulation.
To lawmakers blindly acting on behalf of retailers, regulation is a brilliant idea, regardless of how it affects their constituents. For decades, they have promised these interventions would eventually benefit consumers. But the lessons from the Durbin Amendment in the United States and price cap regulation in Australia are clear. Although some policymakers bemoan the current economic model, arbitrarily “cutting” rates for the sake of cuts completely ignores the economic reality that as billions of dollars move to merchants, billions are lost by consumers.
For the uninitiated, let’s break down what credit interchange funds: 1) the cost of fraud; 2) more than $40 billion in consumer rewards; 3) the cost of nonpayment by consumers, which is typically 4% of revolving credit; 4) more than $300 billion in credit floats to U.S. consumers; and 5) drastically higher “ticket lift” for merchants.
These are just some of the benefits. If costs were all that mattered, American Express wouldn’t exist. Until recently, it was by far the most expensive U.S. network. Yet, merchants still took AmEx because they knew the average AmEx “swipe” was around $140, far more than Visa and Mastercard.
Put simply, for a few basis points, interchange functions as a small insurance policy to safeguard retailers from the threat of fraud and nonpayment by consumers. Consider the amount of ink spilled on interchange when no one mentions that the chargeoff rate for issuing banks on bad credit card debt exceeds credit interchange.
Looking abroad, interchange opponents cite Australia, which halved interchange fees nearly 20 years ago, as a glowing example of how to regulate credit cards. In truth, Australia’s regulations have harmed consumers, reduced their options, and forced Australians to pay more for less appealing credit card products.
First, the cost of a basic credit card is $60 USD at many Australian banks. How many millions of Americans would lose access to credit if the annual cost went from $0 to $60? Can you imagine the consumer outrage?
In a two-sided market like credit cards, any regulated shift to one side acts as a massive tax on the other. For Australians, the new tax fell on cardholders. There, annual fees for standard cards rose by nearly 25%, according to an analysis by global consulting firm CRA International. Fees for rewards cards skyrocketed by as much as 77%.
Many no-fee credit cards were no longer financially viable. As a result, they were pulled from the market, leaving lower income Australians, as well as young people working to establish credit, with few viable options in the credit card market.
Even the benefits that lead many people to sign up for credit cards in the first place have been substantially diluted in Australia because of the reduction of interchange fees. In fact, the value of rewards points fell by approximately 23% after the country cut interchange fees.
Efforts to add interchange price caps would have a similar effect here in the U.S. A 50% cut would amount to a $40 billion to $50 billion wealth transfer from consumers and issuers to merchants. For the 20 million or so financially marginalized Americans, what will their access to credit be when issuers find a $50 billion hole in their balance sheets?
The average American generates $167 per year in rewards, according to the Consumer Financial Protection Bureau. Perks like airline miles, hotel points, and cashback rewards would be decimated and would likely be just the province of the rich after regulation. Many middle-class consumers could say goodbye to family vacations booked at almost no cost thanks to credit card rewards.
As the travel industry and retailers fight to bounce back from the impact of the pandemic, slashing consumer rewards and reducing the attractiveness of already-fragile businesses is the last thing lawmakers and regulators in Washington should undertake.
Proposals to follow Australia’s misguided lead in capping interchange may allow retailers to snatch a few extra basis points, but the consequences would be disastrous for consumers. Cards would simply be less valuable and more expensive for Americans, and millions of consumers would lose access to credit. University of Pennsylvania Professor Natasha Sarin estimates debit price caps alone cost consumers $3 billion. How much more would consumers have to pay under Durbin 2.0?
Members of Congress and other leaders should learn from Australia and Durbin 1.0 to avoid making the same mistake twice.
—Drew Johnson is a senior fellow at the National Center for Public Policy Research, Washington, D.C.
More than ever before, your debt and credit records can negatively impact your life or your family’s if left unmanaged. Sadly, many Americans feel entirely helpless about their credit score’s present state and the steps they need to take to fix a less-than-perfect score. This is where Michael Carrington, founder of Tier 1 Credit Specialist, comes in. Michael is determined to offer thousands of Americans an educated, informed approach towards credit restoration.
Michael understands the plight that having a bad credit score can bring into your life. His first financial industry job was working as a home mortgage loan analyst for one of the nation’s largest lenders. Early on, he had to work a grueling schedule that included several jobs, seven days a week, putting in almost 12-hour days just to make $5,000 a month and barely get by.
“I was tired of living a mediocre life and was determined to increase the value that I can offer others through my knowledge of the finance industry – I started reading all of the necessary books, networking with industry professionals, and investing in mentorship,” shares Michael Carrington. “I got my break when I was able to grow a seven-figure credit repair and funding organization that is flexible enough to address the financial needs of thousands of Americans.”
With his vast experience in the business world, having established himself as a well-respected business leader, Michael Carrington felt he had the power to help millions of Americans restore their credit. Michael learned the FICO system, stayed up to date on the Fair Credit Reporting Act (FCRA), found ways to improve his credit score, and started showing others.
Tier 1 Credit Specialist uses a tested and proven approach to educate its clients on everything about credit scores. Michael is leveraging his experience as a home mortgage professional, marketing executive, and global business coach to inform his clients. He and his team take their time to carefully go through their clients’ credit records as they try to find the root of the problem and find suitable financial solutions.
The company is changing lives all over America as it helps families and individuals to repair their credit scores, gain access to lower interest rates on loans and get better jobs. What Tier 1 Credit Specialist is offering many Americans is a chance at financial freedom.
Michael Carrington has repaired over $8 million in debt write-ups and has helped fund Americans with over $4 million through thousands of fixed reports. “I credit our success to being people-focused,” he often says. “The amount of success that we create is going to be in direct proportion to the amount of value that we provide people – not just our customers – people.”
Because of its people-focused goals, Tier 1 Credit Specialist is determined to help millions of Americans achieve financial literacy. It is currently receiving rave reviews from clients who are completely happy with the credit repair solutions that the company has provided them.
Today, Michael Carrington is continuing with a new initiative to serve more Americans who suffer from bad credit due to little or no access to affordable resources for repair.
The Tier 1 Credit Specialist brand is changing the outlook of many families across America. To do this, the company has created an affiliate system that will provide more people with ways of earning during these tough economic times.
As a well-respected international business leader and entrepreneur with numerous achievements to his name, Michael Carrington aims to help millions of Americans achieve the financial freedom he is experiencing today. Tier 1 Credit Specialist is one of the most effective credit repair brands on the market right now, and it has no plans for slowing down in 2021!
Learn more about Michael Carrington by visiting his Instagram account or checking out the Tier 1 Credit Specialist website.
When it comes to personal finance, nothing is guaranteed. That goes double for credit. That’s why, no matter how perfect your credit or how many times you’ve applied for a new credit card, there’s always that moment of doubt while you wait for a decision.
Issuing banks look at a wide range of factors when making a decision — and your credit score is only one of them. They look at your entire credit history, and consider things like your income and even your history with the bank itself.
For example, if you defaulted on a credit card with a given bank 15 years ago, that mistake is likely long gone from your credit reports. To you and the three major credit bureaus, it is ancient history. But banks are like elephants — they never forget. And that mistake could be enough to stop your approval.
But does it go the other way, too? Does having a bank account that’s in good standing with an issuer make you more likely to get approved? While there’s no clear-cut answer, there are a few cases when it could help.
A good relationship may weigh in your favor
Credit card issuers rarely come right out and say much about their approval processes, so we often have to rely on anecdotal evidence to get an idea of what works. That said, you can find a number of stories of folks who were approved for a credit card they had previously been denied after they opened a savings or checking account with the issuer.
These types of stories are more common at the extreme ends of the card range. If you have a borderline bad credit score, for instance, having a long, positive banking history with the issuer — like no overdrafts or other problems — may weigh in your favor when applying for a credit card. That’s because the bank is able to see that you have regular income and don’t overspend.
Similarly, a healthy savings or investment account with a bank could be a helpful factor when applying for a high-end rewards credit card. This allows the bank to see that you can afford its product and that you have the type of funds required to put some serious spend on it.
Having a good banking relationship with an issuer can be particularly helpful when the economy is questionable and banks are tightening their proverbial purse strings. When trying to minimize risk, going with applicants you’ve known for years simply makes more sense than starting fresh with a stranger.
Some banks provide targeted offers
Another way a previous banking relationship with an issuer can help is through targeted credit card offers. These are sort of like invitations to apply for a card that the bank thinks will be a good fit for you. While approval for targeted offers is still not guaranteed, some types of targeted offers can be almost as good.
For example, the only confirmed way to get around Chase’s 5/24 rule (which is that any card application will be automatically denied if you’ve opened five or more cards in the last 24 months) is to receive a special “just for you” offer through your online Chase account. When these offers show up — they’re marked with a special black star — they will generally lead to an approval, no matter what your current 5/24 status.
Credit unions require membership
For the most part, you aren’t required to have a bank account with a particular issuer to get a credit card with that bank. However, there is one big exception: credit unions. Due to the different structure of a credit union vs. a bank, credit unions only offer their products to current members of the credit union.
To become a member, you need to actually have a stake in that credit union. In most cases, this is done by opening a savings account and maintaining a small balance — $5 is a common minimum.
You can only apply for a credit union credit card once you’ve joined, so a bank account is an actual requirement in this case. That said, your chances of being approved once you’re a member aren’t necessarily impacted by how much money you have in the account.
In general, while having a bank account with an issuer may be helpful in some cases, it’s not a cure-all for bad credit. Your credit history will always have more impact than your banking history when it comes to getting approved for a credit card.