Analyzing the Airbnb dataset of Rio de Janeiro

Thiago Santos Figueira
Published in Geek Culture
4 min read · Jun 2, 2021


Photo by Agustin Diaz Gargiulo on Unsplash

Millions of people rent rooms through Airbnb, so its data is likely to hold some interesting findings. Luckily for us, there is an open-source, non-commercial data tool called Inside Airbnb, which gives us access to publicly available information from the rental platform.

There are a few natural questions that come to mind when looking at Airbnb’s datasets. I chose five of them to answer today:

  • What is the most popular neighborhood for renting a room?
  • What is the most popular room type?
  • What is the most expensive room type?
  • The most expensive rooms are located in which neighborhood?
  • What is the expected average price per neighborhood?

What is the most popular neighborhood for renting a room?

First, let us look at which neighborhood offers the highest number of rooms available for rental.
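A minimal sketch of this count with pandas. It assumes the listings table has a `neighbourhood` column, as the Inside Airbnb summary files usually do; the tiny DataFrame below is invented for illustration and is not the real dataset.

```python
import pandas as pd

# Toy stand-in for the Inside Airbnb listings file; rows are invented.
listings = pd.DataFrame({
    "neighbourhood": [
        "Copacabana", "Copacabana", "Ipanema",
        "Barra da Tijuca", "Copacabana", "Ipanema",
    ],
})

# Count listings per neighborhood; value_counts sorts largest first.
rooms_per_neighbourhood = listings["neighbourhood"].value_counts()
print(rooms_per_neighbourhood.head(10))
```

With the real file, the same call would run on the DataFrame loaded from the downloaded CSV.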

Notice that Copacabana is the neighborhood with the highest number of room options available. This does not mean it is the most popular, though. Our dataset gives us the number of reviews each room received up to March 2021. The rooms with the most reviews must be more popular, right?
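One way to use reviews as a popularity proxy is to sum them per neighborhood. The sketch below assumes columns named `neighbourhood` and `number_of_reviews`; the numbers are made up and only show the mechanics.

```python
import pandas as pd

# Invented sample rows; the real file has one row per listing.
listings = pd.DataFrame({
    "neighbourhood": [
        "Copacabana", "Copacabana", "Ipanema",
        "Barra da Tijuca", "Barra da Tijuca",
    ],
    "number_of_reviews": [40, 25, 50, 3, 7],
})

# Total reviews per neighborhood, most reviewed first.
reviews_per_neighbourhood = (
    listings.groupby("neighbourhood")["number_of_reviews"]
    .sum()
    .sort_values(ascending=False)
)
print(reviews_per_neighbourhood)
```

Note that a neighborhood with few listings can still rank high here if its listings are reviewed often, which is why this ranking can differ from the raw room count.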

Copacabana is the most popular neighborhood, followed by Ipanema. While Ipanema has slightly fewer rooms than Barra da Tijuca, it is featured in many national TV productions, making it a popular destination.

What is the most popular room type?

As before, let us first look at the total number of available rooms per type.

We can look at the total number of reviews per room type to approximate the most popular rooms in Rio. Since entire home/apt is the most common room type available, I would expect it to be the most rented.
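Both views, listings per type and reviews per type, can be sketched in a few lines. The column names `room_type` and `number_of_reviews` are assumptions about the file layout, and the rows are invented.

```python
import pandas as pd

# Invented sample rows for illustration only.
listings = pd.DataFrame({
    "room_type": [
        "Entire home/apt", "Entire home/apt", "Entire home/apt",
        "Private room", "Private room", "Shared room",
    ],
    "number_of_reviews": [30, 20, 10, 15, 5, 2],
})

# How many listings exist for each room type.
rooms_per_type = listings["room_type"].value_counts()

# How many reviews each room type has collected in total.
reviews_per_type = (
    listings.groupby("room_type")["number_of_reviews"]
    .sum()
    .sort_values(ascending=False)
)
print(rooms_per_type)
print(reviews_per_type)
```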

We see rooms belonging to the category of entire apartments have been reviewed the most, which indicates they indeed are a popular option.

What is the most expensive room type?

If we organize our data to find the average price per room type, we get:
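A sketch of that grouping, assuming `room_type` and `price` columns. The prices are invented, and one extreme shared-room price is included on purpose to show how a single listing can pull a mean upward.

```python
import pandas as pd

# Invented prices; the 1900 shared room is a deliberate extreme value.
listings = pd.DataFrame({
    "room_type": [
        "Entire home/apt", "Entire home/apt",
        "Private room", "Private room",
        "Shared room", "Shared room",
    ],
    "price": [400, 600, 150, 250, 100, 1900],
})

# Average price per room type, most expensive first.
mean_price_per_type = (
    listings.groupby("room_type")["price"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_price_per_type)
```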

This is interesting! It seems that visitors pay, on average, more for a shared room than for an entire home/apt, which is unexpected. However, I suspect there might be outliers. Let us visualize this data in a boxplot.

The boxplot is a graph that allows us to quickly compare summary statistics, such as the median and quartiles, between groups. In the image above, we have two boxplots. Notice, on the left, that shared rooms have an outlier that is distorting their mean. The image on the right lets us see that shared rooms have, in fact, lower prices than the other room types. Therefore, when visiting Rio de Janeiro, we would expect to pay more to rent an entire home or apartment.
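The effect of the outlier can also be checked numerically. The sketch below uses the common 1.5 × IQR whisker rule to flag outliers within each room type; this is one convention and not necessarily what the notebook does. The helper `within_whiskers` is hypothetical and the prices are invented.

```python
import pandas as pd

# Invented prices; the 5000 shared room plays the role of the outlier.
listings = pd.DataFrame({
    "room_type": ["Shared room"] * 5 + ["Entire home/apt"] * 5,
    "price": [50, 60, 70, 80, 5000, 300, 350, 400, 450, 500],
})

def within_whiskers(prices: pd.Series) -> pd.Series:
    """Flag prices inside the 1.5 * IQR fences of their own group."""
    q1, q3 = prices.quantile(0.25), prices.quantile(0.75)
    iqr = q3 - q1
    return prices.between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Mean per room type with every listing included.
raw_mean = listings.groupby("room_type")["price"].mean()

# Apply the rule per room type, then average only the kept listings.
keep = listings.groupby("room_type")["price"].transform(within_whiskers).astype(bool)
trimmed_mean = listings[keep].groupby("room_type")["price"].mean()

print(raw_mean)
print(trimmed_mean)
```

In this toy data the shared-room mean drops below the entire home/apt mean once the flagged listing is excluded, which mirrors the reversal described above.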

The most expensive rooms are located in which neighborhood?

We group the data to find the average price for each neighborhood.

As before, outliers might be distorting the mean price for each neighborhood. Let us organize the data according to the median.
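Mean and median can be computed side by side so the two rankings are easy to compare. The columns `neighbourhood` and `price` are assumed, and every price below is invented rather than taken from the dataset.

```python
import pandas as pd

# Invented prices; Ipanema gets one extreme listing to skew its mean.
listings = pd.DataFrame({
    "neighbourhood": ["Joá"] * 3 + ["Ipanema"] * 3 + ["Copacabana"] * 3,
    "price": [2000, 2500, 7000, 300, 400, 20000, 200, 300, 400],
})

# One row per neighborhood with both statistics as columns.
stats = listings.groupby("neighbourhood")["price"].agg(["mean", "median"])

by_mean = stats.sort_values("mean", ascending=False)
by_median = stats.sort_values("median", ascending=False)
print(by_mean)
print(by_median)
```

A neighborhood that ranks high in both tables is expensive across the board, while one that ranks high only by mean is probably driven by a few extreme listings.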

The median tells a different story from the previous image. However, notice that Joá appears among the top five neighborhoods in both graphs, which shows it indeed has higher prices according to both mean and median.

What is the expected average price per neighborhood?

After calculating the mean price per neighborhood, we can visualize the data on a map. We can place markers showing the information we want.
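A sketch of how the marker data could be prepared with pandas. It assumes `latitude` and `longitude` columns and places each marker at the average coordinates of a neighborhood's listings, which is my own simplification. The coordinates and prices are rough, invented values.

```python
import pandas as pd

# Invented listings with approximate coordinates.
listings = pd.DataFrame({
    "neighbourhood": ["Copacabana", "Copacabana", "Ipanema"],
    "price": [200, 400, 500],
    "latitude": [-22.97, -22.98, -22.985],
    "longitude": [-43.18, -43.19, -43.20],
})

# One row per neighborhood: mean price plus a marker position.
markers = listings.groupby("neighbourhood").agg(
    mean_price=("price", "mean"),
    latitude=("latitude", "mean"),
    longitude=("longitude", "mean"),
)

# Text each marker would display.
labels = [f"{row.Index}: {row.mean_price:.2f}" for row in markers.itertuples()]
print(labels)
```

Each row of `markers` could then be handed to a mapping library, for example as the location and popup text of a folium `Marker`, if that is the tool in use.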

Summary

  • What is the most popular neighborhood for renting a room? Copacabana, followed by Ipanema.
  • What is the most popular room type? Entire home/apt.
  • What is the most expensive room type? Entire home/apt.
  • The most expensive rooms are located in which neighborhood? Joá is the most expensive neighborhood in Rio. We would expect to pay, on average, 3926.83 per room. The median price is 2277.5: half of the rooms are priced above it and half below.

You can find the code available in a Jupyter Notebook here. Thank you for reading.

Photo by Marco Bianchetti on Unsplash
