Posted on April 4, 2013 @ 11:30:00 AM by Paul Meagher
This blog is a follow-up to yesterday's blog introducing the idea of Bayesian Angel Investing.
In 2004 I wrote three articles for IBM developerWorks on Bayesian inference and developed PHP-based code to explore the topic. I'd like to follow up on some of that work by exploring how Bayesian inference might be applied to Angel Investing.
It is hard to pick a starting point for this investigation. I thought the best way to begin would be to give a quick demo of how to use a ClassifierDiagnostics.php class I developed to analyze the relationship between two binary-valued variables (a "test" variable and a "classification" variable). Doing so will introduce you to many concepts, calculations, and stats you should be familiar with if you want to apply Bayesian inference to Angel Investing.
The two variables we will be analyzing in the demo code below are a "Business Plan Quality" test variable and a "Successful Company" classification variable. The data we will be inputting to our software for analysis consists of a binary rating of Business Plan Quality (0=Fail, 1=Pass) and a binary rating for the Successful Company variable (0=Not Successful, 1=Successful). Each of the four $data records below corresponds to an observation of one startup company: the Business Plan Quality rating for that company, paired with its eventual success or failure. One question to investigate is whether the Business Plan Quality measurement should be used as a "test" for diagnosing whether a startup company will be successful or not.
Without further ado, here is the source code for the business_plan_and_success.php demo script which invokes input, analysis, and output functions supplied by the ClassifierDiagnostics.php class.
Below is the output generated by running the demo script. The first two tables are the joint frequency and joint probability tables. (The probabilities in the second table are computed within each Successful Company column, so each column sums to 1.) Underneath these tables are various diagnostic stats that can be used to assess the quality of your "test" variable (i.e., Business Plan Quality) in classifying a startup as successful or not.
| Business Plan | Successful Company: Yes | Successful Company: No |
| --- | --- | --- |
| Pass | 2 (TP) | 0 (FP) |
| Fail | 1 (FN) | 1 (TN) |
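To make the four cells of the frequency table concrete, here is a minimal sketch of how four binary observations could be tallied into TP, FP, FN, and TN counts. It is not the ClassifierDiagnostics.php code. The `$observations` array and the `tallyCells()` function are hypothetical names, and the records are reconstructed from the counts in the table (their order is an assumption).

```php
<?php
// Hypothetical records reconstructed from the frequency table:
// [business plan rating, company outcome]
// 1 = Pass / Successful, 0 = Fail / Not Successful.
// Only the counts come from the table; the ordering is an assumption.
$observations = [
    [1, 1],
    [1, 1],
    [0, 1],
    [0, 0],
];

// Tally each record into one of the four cells of the 2x2 table.
function tallyCells(array $observations): array
{
    $cells = ['TP' => 0, 'FP' => 0, 'FN' => 0, 'TN' => 0];
    foreach ($observations as [$test, $class]) {
        if ($test === 1 && $class === 1) {
            $cells['TP']++; // plan passed, company succeeded
        } elseif ($test === 1 && $class === 0) {
            $cells['FP']++; // plan passed, company did not succeed
        } elseif ($test === 0 && $class === 1) {
            $cells['FN']++; // plan failed, company succeeded
        } else {
            $cells['TN']++; // plan failed, company did not succeed
        }
    }
    return $cells;
}

$cells = tallyCells($observations);
print_r($cells);
```

Running this on the four records gives TP=2, FP=0, FN=1, TN=1, which matches the frequency table.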
| Business Plan | Successful Company: Yes | Successful Company: No |
| --- | --- | --- |
| Pass | 0.67 (TP) | 0.00 (FP) |
| Fail | 0.33 (FN) | 1.00 (TN) |
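The probabilities in the table above can be reproduced by dividing each cell count by its column total, i.e. by the number of companies with that outcome. The sketch below assumes that is how they were derived (the values are consistent with it). `columnProbabilities()` is a hypothetical helper, not part of ClassifierDiagnostics.php.

```php
<?php
// Cell counts taken from the frequency table.
$counts = ['TP' => 2, 'FP' => 0, 'FN' => 1, 'TN' => 1];

// Divide each cell by the total of its outcome column.
function columnProbabilities(array $c): array
{
    $successful    = $c['TP'] + $c['FN']; // "Yes" column total
    $notSuccessful = $c['FP'] + $c['TN']; // "No" column total
    if ($successful === 0 || $notSuccessful === 0) {
        throw new InvalidArgumentException(
            'Each outcome column needs at least one observation.'
        );
    }
    return [
        'TP' => $c['TP'] / $successful,
        'FP' => $c['FP'] / $notSuccessful,
        'FN' => $c['FN'] / $successful,
        'TN' => $c['TN'] / $notSuccessful,
    ];
}

$probs = columnProbabilities($counts);
foreach ($probs as $cell => $p) {
    printf("%s: %.2f\n", $cell, $p);
}
// Prints:
// TP: 0.67
// FP: 0.00
// FN: 0.33
// TN: 1.00
```

Dividing each cell by the overall total of 4 instead would give 0.50, 0.00, 0.25, and 0.25, so the table reports within-column proportions rather than proportions of all observations.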
| Statistic | Value |
| --- | --- |
| Test Sensitivity (TP) | 0.67 |
| False Alarm Rate (FP) | 0.00 |
| Miss Rate (FN) | 0.33 |
| Test Specificity (TN) | 1.00 |
| Base Rate | 0.75 |
| P(+Test) | 0.50 |
| P(-Test) | 0.50 |
| P(+Class \| +Test) | 1.00 |
| P(-Class \| +Test) | 0.00 |
| P(+Class \| -Test) | 0.50 |
| P(-Class \| -Test) | 0.50 |
| Likelihood Ratio(+Test) | 0.00 |
| Likelihood Ratio(-Test) | 0.33 |
| Accuracy | 0.75 |
| Gain | 1.33 |
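For readers who want to see where these numbers come from, here is a rough sketch that derives them from the four cell counts (TP=2, FP=0, FN=1, TN=1). It is not the ClassifierDiagnostics.php source. The `ratio()` and `diagnosticStats()` functions are hypothetical, and the formulas for the likelihood ratios and Gain are inferred from the reported values (miss rate / specificity, and P(+Class | +Test) / base rate), so treat them as assumptions.

```php
<?php
// Divide two values, returning null when the result is undefined.
function ratio(?float $num, ?float $den): ?float
{
    if ($num === null || $den === null || $den == 0.0) {
        return null;
    }
    return $num / $den;
}

// Hypothetical helper: derive the diagnostic stats from the 2x2 counts.
function diagnosticStats(int $tp, int $fp, int $fn, int $tn): array
{
    $n = $tp + $fp + $fn + $tn;
    $s = [
        'Test Sensitivity'   => ratio($tp, $tp + $fn),
        'False Alarm Rate'   => ratio($fp, $fp + $tn),
        'Miss Rate'          => ratio($fn, $tp + $fn),
        'Test Specificity'   => ratio($tn, $fp + $tn),
        'Base Rate'          => ratio($tp + $fn, $n),
        'P(+Test)'           => ratio($tp + $fp, $n),
        'P(-Test)'           => ratio($fn + $tn, $n),
        'P(+Class | +Test)'  => ratio($tp, $tp + $fp),
        'P(-Class | +Test)'  => ratio($fp, $tp + $fp),
        'P(+Class | -Test)'  => ratio($fn, $fn + $tn),
        'P(-Class | -Test)'  => ratio($tn, $fn + $tn),
        'Accuracy'           => ratio($tp + $tn, $n),
    ];
    // Assumed definitions, inferred from the reported values.
    $s['Likelihood Ratio(+Test)'] = ratio($s['Test Sensitivity'], $s['False Alarm Rate']);
    $s['Likelihood Ratio(-Test)'] = ratio($s['Miss Rate'], $s['Test Specificity']);
    $s['Gain'] = ratio($s['P(+Class | +Test)'], $s['Base Rate']);
    return $s;
}

$stats = diagnosticStats(2, 0, 1, 1);
foreach ($stats as $name => $value) {
    printf("%-24s %s\n", $name, $value === null ? 'undefined' : number_format($value, 2));
}

// Bayes' theorem check:
// P(+Class | +Test) = P(+Test | +Class) * P(+Class) / P(+Test)
$posterior = $stats['Test Sensitivity'] * $stats['Base Rate'] / $stats['P(+Test)'];
printf("Bayes check: %.2f\n", $posterior);
```

One design choice to note: with a false alarm rate of zero, the positive likelihood ratio under this definition involves a division by zero, so the sketch reports it as undefined, whereas the output above shows 0.00 for that entry. The Bayes check at the end recovers P(+Class | +Test) from the sensitivity, base rate, and P(+Test).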
I'll return to discussing some of the stats being reported here in a later blog. For now, I'd like to complete the technical part of the demo by showing you the source code for the ClassifierDiagnostics.php class. If you put ClassifierDiagnostics.php in the same PHP-enabled folder as the business_plan_and_success.php demo script, then point your browser at the demo script, you will see the output above.