This document explains the operations performed on the bank marketing dataset and shows representative example outputs (tables and charts).
Reads the dataset into a tabular structure and presents a preview so you can verify columns, sample values, and basic shape.
| age | job | marital | education | balance | y |
|---|---|---|---|---|---|
| 30 | admin. | married | university.degree | 1789 | no |
| 34 | technician | single | high.school | 0 | no |
| 47 | blue-collar | married | basic.9y | 1506 | yes |
| 22 | services | single | high.school | 0 | no |
| 58 | retired | married | illiterate | 214 | no |
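
The load-and-preview step could be sketched as follows. This is a minimal illustration that assumes pandas is the tabular library (the document does not name one) and reads a hypothetical inline CSV built from the preview rows above; a real run would pass the dataset's file path and delimiter to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical inline sample mirroring the preview rows above.
# A real run would read the dataset file rather than a StringIO buffer.
SAMPLE_CSV = """age,job,marital,education,balance,y
30,admin.,married,university.degree,1789,no
34,technician,single,high.school,0,no
47,blue-collar,married,basic.9y,1506,yes
22,services,single,high.school,0,no
58,retired,married,illiterate,214,no
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names, in file order
print(df.head())         # first rows for a visual sanity check
```
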
Counts missing entries per column and reports each column's data type so you can plan cleaning and type conversions.
| column | missing |
|---|---|
| age | 0 |
| job | 0 |
| education | 2 |
| balance | 0 |
| y | 0 |

| column | type |
|---|---|
| age | integer |
| job | categorical |
| balance | float |
| y | categorical (target) |
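
A sketch of the missing-value and type check, again assuming pandas. The inline sample is hypothetical: two `education` cells are left empty so the missing count matches the table above. `isna().sum()` counts missing cells per column and `dtypes` reports the type pandas inferred for each column.

```python
import io

import pandas as pd

# Hypothetical sample; two education cells are deliberately left empty.
SAMPLE_CSV = """age,job,education,balance,y
30,admin.,university.degree,1789,no
34,technician,,0,no
47,blue-collar,basic.9y,1506,yes
22,services,,0,no
58,retired,illiterate,214,no
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

missing = df.isna().sum()  # number of missing cells per column
dtypes = df.dtypes         # inferred type per column

# Combine both views into one summary table for easier reading.
summary = pd.DataFrame({"missing": missing, "dtype": dtypes.astype(str)})
print(summary)
```

Columns with a nonzero missing count are candidates for imputation or row removal, and text columns such as `job` could be converted with `astype("category")` if a categorical type is wanted.
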
Computes descriptive statistics for numeric fields (count, mean, std, min, quartiles, max) and for categorical fields (count, unique values, top category). The table below shows the numeric summary.
| metric | age | balance |
|---|---|---|
| count | 4521 | 4521 |
| mean | 41.7 | 1362.3 |
| std | 10.2 | 3045.1 |
| min | 18 | -6847 |
| 25% | 33 | 71 |
| 50% | 39 | 448 |
| 75% | 50 | 1428 |
| max | 95 | 102127 |
Shows the count and share of each target class (how many subscribed versus how many did not). This is useful for detecting class imbalance and for planning sampling strategies.
| class | count | percent |
|---|---|---|
| no | 3985 | 88% |
| yes | 536 | 12% |
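
The class distribution can be computed along these lines, assuming pandas. The series below is synthetic: it only reproduces the class counts from the table above so the sketch runs on its own. `value_counts()` gives the counts and `value_counts(normalize=True)` gives the fractions.

```python
import pandas as pd

# Synthetic target column rebuilt from the class counts in the table above.
y = pd.Series(["no"] * 3985 + ["yes"] * 536, name="y")

counts = y.value_counts()                                  # rows per class
percent = (y.value_counts(normalize=True) * 100).round(1)  # share in percent

distribution = pd.DataFrame({"count": counts, "percent": percent})
print(distribution)

# Ratio of majority to minority class, a quick indicator of imbalance.
imbalance_ratio = counts.max() / counts.min()
print(round(imbalance_ratio, 2))
```

A ratio well above 1 suggests considering stratified splits, resampling, or class weights before modeling.
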
Computes Pearson correlation coefficients between pairs of numeric fields to reveal linear relationships and potential multicollinearity.
Interpretation notes: coefficients range from -1 to 1. In the color-coded chart, values near 1 (red) indicate a strong positive linear relationship, values near -1 (blue) indicate a strong inverse relationship, and values near 0 indicate little linear association. Use this information to guide feature selection or regularization.
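
A minimal sketch of the correlation step, assuming pandas and a hypothetical five-row sample of the two numeric fields shown earlier. `DataFrame.corr(method="pearson")` returns the full correlation matrix; the manual calculation afterwards is only there to show what a single entry means (covariance divided by the product of the standard deviations).

```python
import pandas as pd

# Hypothetical sample of the numeric fields; real values come from the dataset.
df = pd.DataFrame({
    "age": [30, 34, 47, 22, 58],
    "balance": [1789, 0, 1506, 0, 214],
})

# Pairwise Pearson correlation matrix (symmetric, with 1.0 on the diagonal).
corr = df.corr(method="pearson")
print(corr)

# The same coefficient by hand: cov(x, y) / (std(x) * std(y)).
r_manual = df["age"].cov(df["balance"]) / (df["age"].std() * df["balance"].std())
print(r_manual)
```

To get the red and blue coloring described above, the matrix could be drawn as a heatmap with a diverging colormap (for example matplotlib's `coolwarm`) centered on zero.
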