I wanted a reason to mess around with the Python visualization libraries matplotlib and bokeh, so I decided to look into some information on the TV show Survivor. The results are largely unremarkable, but I’ve created pretty nice-looking graphs (at least compared to what the libraries make when left to their own devices) so I thought I’d show them. Note that I chose day 20 as a common cut-off because merges usually happen around that time.

I may add additional graphs intermittently once I get some more time to play around. I’ll also return to add usable Python2 functions for each graph once I clean them up and make them adaptable to different data. Data are from here (through Survivor: Cambodia).

**Age Distribution of Contestants**

*Blue = All Contestants, Green = Day 20 Contestants, Red = Finalists*

*See [1] below for relevant code.*

**Age Breakdown by Days Lasted**

*See [2] below for relevant code.*

**Regional Distribution of Contestants **

**Competitions Won by Finish Position**

**Finish Position (20=Last, 1=Winner)**

*Blue = Average, Light Blue = Maximum, Orange = Minimum*

**Performance vs. Average Votes Against**

*The radius of the bubbles is proportional to the number of contestants who fit that bin of days.*

### Appendix

Functions of variations of the above charts are provided below. Note that some elements were sacrificed to make them intelligible, adaptable and relatively simple.

[2]

```
import numpy as np
import pandas as pd
from bokeh.plotting import figure, show, output_file
def circle_graph(df, index, list_, unit):
"""Takes a pandas df with an index column and a list of columns (3 to 5 for
best results) with float values representing the unit and creates a circle graph"""
"""e.g. circle_graph(survivor_df, 'age group', ['15 to 24 y.o.', '25 to 34 y.o.', '35+ y.o.'], '%')"""
#Set the colors of the bars in the bar graph based on "Tableau 20" colors.
tableau20 = [ '#1f77b4', '#2ca02c','#7f7f7f','#8c564b','#d62728', '#bcbd22', '#9edae5','#17becf',
'#98df8a','#9467bd','#aec7e8','#c5b0d5', '#c7c7c7']
width = 800
height = 800
inner_radius = 90
outer_radius = 290
big_angle = 2.0 * np.pi / (len(df) + 1)
small_angle = big_angle / (len(list_) *2 + 1)
p = figure(plot_width=width, plot_height=height, title="", x_axis_type=None, y_axis_type=None,
x_range=(-420, 420), y_range=(-420, 420), min_border=0, outline_line_color="white")
p.xgrid.grid_line_color = None
p.ygrid.grid_line_color = None
#draw the large wedges
angles = np.pi/2 - big_angle/2 - df.index.to_series()*big_angle
p.annular_wedge(0, 0, inner_radius, outer_radius, -big_angle + angles, angles, color='#cfcfcf',)
#find the maximum value for the concentric rings of values
counter = 0
column_max = list()
for column in list_:
column_max.append(max(df[column]))
counter += 1
label_max = int(max(column_max) + 10 - max(column_max)%10)
#draw the small wedges and labels
bar_color = {}
counter = 0
for column in list_:
bar_color[column] = tableau20[counter]
p.annular_wedge(0, 0, inner_radius, 90 + df[column]*(200/float(label_max)),
-big_angle+angles+(2*counter+1)*small_angle, -big_angle+angles+(2*counter+2)*small_angle,
color=bar_color[column])
p.rect([-40, -40, -40], [37-counter*18], width=30, height=13, color=bar_color[column])
p.text([-15, -15, -15], [37-counter*18], text=[column], text_font_size="9pt", text_align="left", text_baseline="middle")
counter += 1
#draw the rings and corresponding labels
labels = np.array(range((label_max+10)/10))*10
radii = 90 + labels* (200/float(label_max))
p.circle(0, 0, radius=radii[:-1], fill_color=None, line_color="#E6E6E6")
p.text(0, radii, [str(z)+str(unit) for z in labels[:-1]], text_font_size="8pt", text_align="center", text_baseline="middle")
#draw the spokes separating
p.annular_wedge(0, 0, inner_radius-10, outer_radius+10, -big_angle+angles, -big_angle+angles, color="black")
xr = radii[-1]*np.cos(np.array(-big_angle/2 + angles))
yr = radii[-1]*np.sin(np.array(-big_angle/2 + angles))
label_angle=np.array(-big_angle/2+angles)
label_angle[label_angle < -np.pi/2] += np.pi
p.text(xr, yr, df[index], angle=label_angle, text_font_size="9pt", text_align="center", text_baseline="middle")
output_file("example.html", title="example.py")
show(p)
test_df = pd.read_csv(r"C:\Users\Zachery McKinnon\Documents\survivor_demographics_agexdayslasted1.csv")
```

Hi Zachery,

I was looking at the article “things I’ve worked on: survivo by the number” (https://zacherymckinnon.com/2016/09/26/things-ive-worked-on-survivor-by-the-numbers/).

I was no at able to find the dataset so I was no able to try the Pyhton Code. I looked also at https://survivor.fandom.com/wiki/List_of_Survivor_contestants but

I was no able to find the dataset. Could you please send me or tell me how to get it.

I tried also different type of dataset but I was not able to make one working.

What is record format I have to prepare?

I get also an error with: [’15 to 24 y.o.’, ’25 to 34 y.o.’, ’35+ y.o.’]

KeyError: ’15 to 24 y.o.’

Any ideas?

Thanks,

Marco

LikeLike