import math
import numpy as np
import pandas as pd
import seaborn as sns
import scipy.stats as st
from matplotlib import pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
a_data = pd.read_csv('amazon_bestsellers.csv')
a_data.index += 1
a_data.rename(columns={'Name': 'name',
                       'Author': 'author',
                       'User Rating': 'rating',
                       'Reviews': 'reviews',
                       'Price': 'price',
                       'Year': 'year',
                       'Genre': 'genre'}, inplace=True)
The internet has allowed consumers to easily rate many different kinds of products and services, including restaurants, Uber drivers, pop albums, and hotel rooms. These ratings let other consumers decide whether a particular product or service is worth its price. During the COVID-19 pandemic, the combination of extra free time and the need to spend most of that time at home has led many people to read more books than they previously did. A recent survey by Global English Editing suggests that 35% of people worldwide have read more books than usual over the past year, while 14% have read significantly more than usual (https://geediting.com/world-reading-habits-2020/). Accordingly, it has become more important for readers to be able to accurately judge whether they will enjoy a book before committing to buy it.
In this project, we will analyze open-source data about the top 50 bestselling books on Amazon every year from 2009 to 2019. We want to investigate the relationship between book rating and book price, as well as how prices and ratings have changed over that timeframe.
We restrict our data to the top 50 bestselling books to eliminate outlier data consisting of many books that are poorly rated. Our data set includes book metadata (name, author, genre, year), average user rating, how many reviews created that rating, and the price of that book on Amazon. Genre is actually a simplified category identifying each book as either fiction or non-fiction. Lastly, some of the authors in the dataset are not individuals but are organizations that have written and published books, such as the American Psychological Association. Below is a sample of the dataset.
a_data.head()
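As a quick sanity check on the schema described above, one can confirm that the genre column really is a two-level category. The sketch below uses a small toy frame standing in for `a_data`; the values are illustrative, not from the real dataset.

```python
import pandas as pd

# Toy stand-in for a_data with the same columns described above
# (values are made up for illustration).
toy = pd.DataFrame({
    'name': ['Book A', 'Book B', 'Book C'],
    'author': ['X', 'Y', 'X'],
    'rating': [4.7, 4.2, 4.8],
    'reviews': [1200, 800, 4300],
    'price': [12, 8, 20],
    'year': [2009, 2010, 2010],
    'genre': ['Fiction', 'Non Fiction', 'Fiction'],
})

# Genre is a simplified two-level category, so value_counts() should
# contain only 'Fiction' and 'Non Fiction'.
genre_counts = toy['genre'].value_counts()
print(genre_counts)
```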
First, we will try to determine if there is any correlation between rating and price. We hypothesize that more highly rated books would be in higher demand and would thus demand higher prices. Here is a violin plot of book price versus book rating.
fig, ax = plt.subplots(figsize=(14,8.65))
ax = sns.violinplot(x="rating", y="price", data=a_data, color='lightseagreen')
plt.xlabel("User Rating", size=14)
plt.ylabel("Price ($)", size=14)
plt.title("Top 50 Amazon Book Prices vs. Ratings (2009 - 2019)", size=16)
plt.show()
The violin plot does not actually confirm or deny our hypothesis. To determine whether a correlation exists between price and rating, we'll perform regression analysis. We'll also go ahead and make a scatter plot of the data with the regression line.
X = a_data['rating'].values
y = a_data['price'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train.reshape(-1,1), y_train)
coef50 = round(model.coef_[0], 3)
intercept50 = round(model.intercept_, 3)
print('Linear regression model given by:')
print('y = {}x'.format(coef50), end='')
if intercept50 < 0:
    print(' - {}'.format(abs(intercept50)))
else:
    print(' + {}'.format(intercept50))
fig, ax = plt.subplots(figsize=(14,8.65))
plt.xlabel("User Rating", size=14)
plt.ylabel("Price ($)", size=14)
plt.title("Top 50 Amazon Book Prices vs. Ratings (2009 - 2019) w/ Regression", size=16)
ax.scatter(X,y)
x_lin = np.linspace(np.amin(X), np.amax(X), 100)
y_lin = intercept50 + (coef50 * x_lin)
plt.plot(x_lin, y_lin, color='red')
plt.show()
So, our hypothesis was wrong! Even though there is a correlation between rating and price, it is actually in the opposite direction of what we expected. For each 1-point increase in average user rating, book prices decrease by \$5.64.
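To quantify how strong this negative relationship is, one could also compute a Pearson correlation coefficient with `scipy.stats.pearsonr`. The sketch below uses synthetic rating/price data with a built-in negative slope standing in for the real columns, so the specific numbers are illustrative only.

```python
import numpy as np
import scipy.stats as st

# Synthetic stand-ins for a_data['rating'] and a_data['price'] with a
# built-in negative relationship (slope chosen near the fitted -5.64).
rng = np.random.default_rng(0)
rating = rng.uniform(3.3, 4.9, size=200)
price = 40 - 5.64 * rating + rng.normal(0, 2, size=200)

# Pearson r measures both the direction and strength of the linear
# association; its p-value tests r against zero.
r, p = st.pearsonr(rating, price)
print(f"Pearson r = {r:.3f} (p = {p:.3g})")
```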
A convenient consequence of this graphic is that we can quickly determine which books are good deals. Data points located below the regression line represent books that have a lower-than-expected price given their rating while points above the line represent books with a higher-than-expected price given their rating. Therefore, books below the line can be considered bargains while those above the line can be considered overpriced.
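This bargain test is easy to sketch in code: a book whose actual price falls below the price predicted by the regression line is flagged as a bargain. The slope, intercept, and book rows below are hypothetical placeholders, not the fitted values.

```python
import pandas as pd

# Hypothetical slope/intercept standing in for the fitted regression line.
coef, intercept = -5.64, 34.0

# Two made-up books with the same rating but different prices.
books = pd.DataFrame({
    'name': ['Cheap Gem', 'Pricey Tome'],
    'rating': [4.8, 4.8],
    'price': [5.0, 25.0],
})

# Bargain: actual price below the regression prediction for that rating.
books['predicted'] = intercept + coef * books['rating']
books['bargain'] = books['price'] < books['predicted']
print(books[['name', 'price', 'predicted', 'bargain']])
```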
Even though our data only covers 10 years, it may be worthwhile to look at how the top 50 book prices changed over those ten years. While it is unlikely that any drastic change in price occurred during that time period, it's possible that the recovery from the '07-'08 financial crisis may have caused prices to increase slightly. Again, we'll plot the data using a violin plot and then calculate a linear regression model.
fig, ax = plt.subplots(figsize=(14,8.65))
ax = sns.violinplot(x="year", y="price", data=a_data, color='lightseagreen')
plt.xlabel("Year", size=14)
plt.ylabel("Price ($)", size=14)
plt.title("Top 50 Amazon Book Prices by Year (2009 - 2019)", size=16)
plt.show()
X = a_data['year'].values
y = a_data['price'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train.reshape(-1,1), y_train)
coef = round(model.coef_[0], 3)
intercept = round(model.intercept_, 3)
print('Linear regression model given by:')
print('y = {}x'.format(coef), end='')
if intercept < 0:
    print(' - {}'.format(abs(intercept)))
else:
    print(' + {}'.format(intercept))
So, as we predicted, prices changed very little over the ten years covered by our dataset. However, instead of a slight increase there was actually a slight decrease of \$0.38 per year, an amount that is small relative to the \$0 to \$105 range of book prices.
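Another way to see the flatness of the trend is to average prices within each year via a groupby. The sketch below uses a toy year/price frame standing in for `a_data`, with made-up values.

```python
import pandas as pd

# Toy year/price frame standing in for a_data; the mean price per year
# summarizes the (weak) trend discussed above.
toy = pd.DataFrame({
    'year': [2009, 2009, 2010, 2010, 2011],
    'price': [15, 11, 14, 10, 12],
})
yearly_mean = toy.groupby('year')['price'].mean()
print(yearly_mean)
```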
We also hypothesize that people would be less critical of a book that they paid less money for. In other words, if I pay a lot of money for a book it would have to be amazing for me to leave a 5-star review. However, I might leave a 5-star review for a cheap book that was just ok. To test for this, we are going to compare the number of reviews a book receives versus its price, and the overall rating versus the number of reviews. The idea is that cheap books would be bought more often and thus reviewed more, and books with more reviews would be higher rated indicating that cheap books receive a higher rating.
reviews_df = a_data.copy(deep=True)
# Express review counts in thousands for readability.
reviews_df['reviews'] = reviews_df['reviews'] / 1000
fig, ax = plt.subplots(figsize=(14,8.65))
ax = sns.violinplot(x="reviews", y="price", data=reviews_df, color='lightseagreen')
plt.xlabel("Number of Reviews (in 1000s)", size=14)
plt.ylabel("Price ($)", size=14)
plt.title("Top 50 Amazon Book Prices vs. Number of Reviews (2009 - 2019)", size=16)
plt.show()
Based on this violin plot it does not look like the price of the book has a huge effect on the number of reviews a book receives. However, just to confirm, we will also do some regression analysis.
X = reviews_df['reviews'].values
y = reviews_df['price'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train.reshape(-1,1), y_train)
coef_count1 = round(model.coef_[0], 3)
intercept_count1 = round(model.intercept_, 3)
print('Linear regression model given by:')
print('y = {}x'.format(coef_count1), end='')
if intercept_count1 < 0:
    print(' - {}'.format(abs(intercept_count1)))
else:
    print(' + {}'.format(intercept_count1))
fig, ax = plt.subplots(figsize=(14,8.65))
plt.xlabel("Number of Reviews (in 1000s)", size=14)
plt.ylabel("Price ($)", size=14)
plt.title("Top 50 Amazon Book Prices vs. Number of Reviews (2009 - 2019) w/ Regression", size=16)
ax.scatter(X,y)
x_lin = np.linspace(np.amin(X), np.amax(X), 100)
y_lin = intercept_count1 + (coef_count1 * x_lin)
plt.plot(x_lin, y_lin, color='red')
plt.show()
This shows that there is only a very slight correlation, if any, between the number of reviews a book receives and its price. This alone is enough to disprove the hypothesis, but out of curiosity, we also want to check whether there is any correlation between a book's rating and the number of reviews it receives.
fig, ax = plt.subplots(figsize=(14,8.65))
ax = sns.violinplot(x="reviews", y="rating", data=reviews_df, color='lightseagreen')
plt.xlabel("Number of Reviews (in 1000s)", size=14)
plt.ylabel("Rating", size=14)
plt.title("Top 50 Amazon Book Rating vs. Number of Reviews (2009 - 2019)", size=16)
plt.show()
X = reviews_df['rating'].values
y = reviews_df['reviews'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train.reshape(-1,1), y_train)
coef_count2 = round(model.coef_[0], 3)
intercept_count2 = round(model.intercept_, 3)
print('Linear regression model given by:')
print('y = {}x'.format(coef_count2), end='')
if intercept_count2 < 0:
    print(' - {}'.format(abs(intercept_count2)))
else:
    print(' + {}'.format(intercept_count2))
fig, ax = plt.subplots(figsize=(14,8.65))
plt.xlabel("Rating", size=14)
plt.ylabel("Number of Reviews (in 1000s)", size=14)
plt.title("Top 50 Amazon Book Ratings vs. Number of Reviews (2009 - 2019) w/ Regression", size=16)
ax.scatter(X,y)
x_lin = np.linspace(np.amin(X), np.amax(X), 100)
y_lin = intercept_count2 + (coef_count2 * x_lin)
plt.plot(x_lin, y_lin, color='red')
plt.show()
Surprisingly, there also doesn't appear to be much of a correlation between the number of reviews that a book receives and the rating that it gets. This completely shatters the hypothesis that cheaper books would tend to get more reviews and thus a higher rating.
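One compact way to double-check all three pairwise relationships at once is a correlation matrix via `DataFrame.corr()`. The sketch below uses synthetic data standing in for the real columns, built with a deliberately negative rating-price relationship and review counts that are independent of both.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a_data: rating-price is built to be negatively
# related, while review counts are independent of both.
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    'rating': rng.uniform(3.3, 4.9, size=300),
    'reviews': rng.integers(100, 80000, size=300).astype(float),
})
toy['price'] = 40 - 5.64 * toy['rating'] + rng.normal(0, 2, size=300)

# .corr() reports every pairwise Pearson r in one table.
corr = toy[['price', 'rating', 'reviews']].corr()
print(corr.round(3))
```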
To determine who the best rated and most/least expensive authors are, we will narrow our dataset to include only the top 20 authors based on how many books they have had on the top 50 lists for the given time period. This ensures that authors with few data points do not pollute our data. To make this easy, we add a new column which represents the number of top 50 appearances for the author on a given row. Then we create a new data frame of the top 20 authors based on that metric. For this dataset, that is equivalent to the authors who have had 6 or more appearances on the top 50 lists.
Once we have this new DataFrame, we will create yet another DataFrame containing only three columns: each author from the top 20 list, the average rating for that author, and the average book price for that author.
# Count each author's Top 50 appearances.
a_data['count'] = a_data.groupby('author')['author'].transform('count')
top20 = a_data[a_data['count'] >= 6]
top20 = top20.drop(columns=['reviews', 'year', 'count'])
top20_avgs = top20.groupby('author')[['rating', 'price']].mean()
top20_avgs.rename(columns={'rating': 'avg_rating', 'price': 'avg_price'}, inplace=True)
top20_avgs
The simplicity of this DataFrame will make it easy to work with. Similar to what we did in Part 1, we will now plot the average price versus the average rating and create a regression model.
X = top20_avgs['avg_rating'].values
y = top20_avgs['avg_price'].values
labels = top20_avgs.index.values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train.reshape(-1,1), y_train)
coef20 = round(model.coef_[0], 3)
intercept20 = round(model.intercept_, 3)
print('Linear regression model given by:')
print('y = {}x'.format(coef20), end='')
if intercept20 < 0:
    print(' - {}'.format(abs(intercept20)))
else:
    print(' + {}'.format(intercept20))
fig, ax = plt.subplots(figsize=(14,8.65))
plt.xlabel("Average Rating", size=14)
plt.ylabel("Average Price ($)", size=14)
plt.title("Top 20 Amazon Author Average Book Prices vs. Average Ratings (2009 - 2019) w/ Regression", size=16)
ax.scatter(X,y)
for index, author in enumerate(labels):
    ax.annotate(author, (X[index] + 0.01, y[index]))
x_lin = np.linspace(np.amin(X), np.amax(X), 100)
y_lin = intercept20 + (coef20 * x_lin)
plt.plot(x_lin, y_lin, color='red')
plt.show()
There is a little bit to unpack in this graphic. First, given our axes, authors are sorted by rating along the x-axis and by price along the y-axis, so the best and worst authors in both categories stick out right away.
The American Psychological Association and Gallup are not traditional individual authors, so let's take a look at what they actually publish to see what makes them the most expensive and lowest-rated authors, respectively.
print('American Psychological Association:')
for book in a_data.loc[a_data['author'] == 'American Psychological Association']['name'].unique():
    print(book)

print('\nGallup:')
for book in a_data.loc[a_data['author'] == 'Gallup']['name'].unique():
    print(book)
A quick web search reveals that the APA publication is a style manual often used for professional bibliographies and citations, while the Gallup publication is a management book used to determine the reader's professional strengths.
Our graphic also confirms something we discovered earlier: the inverse correlation between rating and price. Here, however, that correlation is even more pronounced. For every 1-point increase in average rating, there is an expected decrease in average price of \$16.65.
Lastly, like we showed in Part 1, given their average rating, authors above the regression line are more expensive than expected and those below are less expensive than expected. If we calculate the residuals for each author, we can order them by residual and determine the least expensive authors given their ratings.
# Residual: actual average price minus the price predicted by the regression.
top20_avgs['res'] = top20_avgs['avg_price'] - (coef20 * top20_avgs['avg_rating'] + intercept20)
top20_avgs.sort_values(by='res')
The resulting DataFrame shows which authors with 6 or more appearances on Amazon's Top 50 lists have the lowest average price given their average rating.
The method used to determine the best-priced authors given their ratings can also be used to determine the best-priced books given their ratings. Since we have already calculated a linear model for price versus rating over all of the books in our dataset, we can simply add a residual column to the DataFrame and calculate.
# Residual for each book relative to the rating-vs-price regression.
a_data['res'] = a_data['price'] - (coef50 * a_data['rating'] + intercept50)
a_data.sort_values(by='res')
The resulting DataFrame tells us that the best-priced book in our dataset is Disney's Journey to the Ice Palace, while the worst-priced book is yet another publication from the APA.
Lastly, we want to check how our regression model compares to models based on other datasets. To do this, we will first import a dataset gathered from Google Books. Like the Amazon data, the Google Books data lists book prices as well as an average rating for each book on a scale from 1 to 5. Since we only need price and rating data, we are going to remove most of the other columns.
One important thing to note is that the original dataset lists prices in Saudi Arabian Riyals (SAR). We will convert prices to US Dollars (USD) using the rate of 1 SAR to 0.27 USD, as quoted by Morningstar on December 16, 2020 at 15:37 UTC.
g_data = pd.read_csv('google_books.csv')
g_data.index += 1
g_data = g_data.drop(columns=['Unnamed: 0', 'description', 'publisher', 'page_count', 'generes', 'ISBN', 'language'])
g_data['price'] *= 0.27
g_data.dropna(subset=['rating', 'price'], inplace=True)
g_data.head()
Like we did with the Amazon data, we will create a linear regression model and determine the slope and intercept of the regression line. We are also going to rename some variables so that they match the notation used in the calculations below.
b1 = coef50
X1 = a_data['rating'].values
Y1 = a_data['price'].values
a_intercept = intercept50
X2 = g_data['rating'].values
Y2 = g_data['price'].values
X_train, X_test, y_train, y_test = train_test_split(X2, Y2, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train.reshape(-1,1), y_train)
b2 = round(model.coef_[0], 3)
g_intercept = round(model.intercept_, 3)
print('Linear regression model given by:')
print('y = {}x'.format(b2), end='')
if g_intercept < 0:
    print(' - {}'.format(abs(g_intercept)))
else:
    print(' + {}'.format(g_intercept))
fig, ax = plt.subplots(figsize=(14,8.65))
plt.xlabel("User Rating", size=14)
plt.ylabel("Price ($)", size=14)
plt.title("Google Book Prices vs. Ratings w/ Regression", size=16)
ax.scatter(X2, Y2)
x_lin = np.linspace(np.amin(X2), np.amax(X2), 100)
y_lin = g_intercept + (b2 * x_lin)
plt.plot(x_lin, y_lin, color='red')
plt.show()
The Google data and regression look very similar to those from the Amazon data. However, the slopes (-5.64 for Amazon and -6.82 for Google) differ somewhat. To determine whether the true slope for the Amazon Top 50 books is indeed greater (less steeply negative) than that for the Google Books list, we will perform a two-sample hypothesis test for slopes. This will be conducted in four steps: establishing the null and alternate hypotheses, calculating the test statistic, calculating the p-value, and comparing the p-value to our significance level. We will use a fairly standard significance level of α = 0.05.
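The test statistic for comparing two independently estimated slopes takes the form

$$ z = \frac{b_1 - b_2}{\sqrt{SE_{b_1}^2 + SE_{b_2}^2}}, \qquad SE_{b_i}^2 = \frac{\sum_j \left(y_j - \hat{y}_j\right)^2}{(n_i - 2)\sum_j \left(x_j - \bar{x}\right)^2}, $$

where $b_i$ are the fitted slopes and $n_i$ the sample sizes of the two datasets.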
# Calculate SEb1 (the squared standard error of the Amazon slope)
m = len(a_data['rating'])
x1_bar = X1.sum() / m
SEb1_num = ((Y1 - (b1*X1 + a_intercept))**2).sum()
SEb1_den = (m - 2) * ((X1 - x1_bar)**2).sum()
SEb1 = SEb1_num / SEb1_den
# Calculate SEb2 (the squared standard error of the Google slope)
n = len(g_data['rating'])
x2_bar = X2.sum() / n
SEb2_num = ((Y2 - (b2*X2 + g_intercept))**2).sum()
SEb2_den = (n - 2) * ((X2 - x2_bar)**2).sum()
SEb2 = SEb2_num / SEb2_den
SEb2
# Calculate test statistic
z_cal = (b1 - b2) / math.sqrt(SEb1 + SEb2)
z_cal
p_val = 1 - st.norm.cdf(z_cal)
p_val
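For reference, the whole two-sample slope test can be sketched end-to-end on synthetic data. The helper name `slope_and_se2` and the two-sided p-value here are illustrative choices, not the exact procedure above, and the two samples are stand-ins for the Amazon and Google data.

```python
import math
import numpy as np
import scipy.stats as st

def slope_and_se2(x, y):
    """OLS slope and its squared standard error (illustrative helper)."""
    b, a = np.polyfit(x, y, 1)  # slope, intercept
    resid = y - (b * x + a)
    se2 = (resid ** 2).sum() / ((len(x) - 2) * ((x - x.mean()) ** 2).sum())
    return b, se2

# Two synthetic samples with the same true slope, standing in for the
# Amazon and Google rating/price data.
rng = np.random.default_rng(0)
x1 = rng.uniform(3, 5, 200)
y1 = 40 - 5.6 * x1 + rng.normal(0, 2, 200)
x2 = rng.uniform(3, 5, 300)
y2 = 40 - 5.6 * x2 + rng.normal(0, 2, 300)

b1, se2_1 = slope_and_se2(x1, y1)
b2, se2_2 = slope_and_se2(x2, y2)

# z-statistic for the difference in slopes; a two-sided p-value here,
# unlike the one-sided test above.
z = (b1 - b2) / math.sqrt(se2_1 + se2_2)
p = 2 * (1 - st.norm.cdf(abs(z)))
print(f"z = {z:.3f}, p = {p:.3f}")
```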
To summarize, the analysis of our datasets resulted in a number of conclusions that help to clarify the relationships between book prices, ratings, numbers of reviews, time, and even how different datasets relate to one another. The main takeaways from our analysis are: