How Python and BeautifulSoup are Used to Scrape Hotel Listings from Booking.com?

iwebscraping

September 22, 2021

Scraping hotel listings from numerous websites is one of the most common uses of web scraping. It might be done to keep an eye on rates, to build an aggregator, or to improve the user experience on existing hotel booking services.

This can be accomplished with a simple script. We’ll use BeautifulSoup to extract the data, and we’ll use Booking.com as the source of hotel information.

To begin, we’ll need these lines of code to retrieve the Booking.com search results page and set up BeautifulSoup so that we can query the page for meaningful data using CSS selectors.

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggI46AdIM1gEaGyIAQGYATG4AQfIAQzYAQHoAQH4AQKIAgGoAgO4AvTIm_IFwAIB;sid=7101b3fb6caa095b7b974488df1521d2;city=-2109472;from_idr=1&;dr_ps=IDR;ilp=1;d_dcp=1'

response=requests.get(url,headers=headers)

soup=BeautifulSoup(response.content,'lxml')

To avoid being blacklisted, we also pass the user agent headers to simulate a browser call.
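As a minimal sketch of the header setup above, the request can be wrapped in a small helper that reuses one session and raises an error on a 4xx or 5xx response instead of silently parsing an error page. The helper names make_session and fetch_html and the 30-second timeout are assumptions for illustration, not part of the original script.

```python
import requests

# Same browser-like User-Agent string used in the script above.
USER_AGENT = ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) '
              'AppleWebKit/601.3.9 (KHTML, like Gecko) '
              'Version/9.0.2 Safari/601.3.9')


def make_session():
    """Return a requests.Session that sends the User-Agent on every call."""
    session = requests.Session()
    session.headers.update({'User-Agent': USER_AGENT})
    return session


def fetch_html(session, url, timeout=30):
    """Fetch a page and raise an HTTPError if the status is 4xx/5xx."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content
```

A call such as fetch_html(make_session(), url) would then return the bytes that are passed to BeautifulSoup.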

Now let’s look at the Booking.com search results for a certain destination.

When we examine the page, we notice that each item’s HTML is contained within a tag with the class sr_property_block.

We can use this class to divide the HTML page into pieces, each of which holds the information about a single property, like this:

# -*- coding: utf-8 -*-

from bs4 import BeautifulSoup

import requests

headers = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}

url = 'https://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggI46AdIM1gEaGyIAQGYATG4AQfIAQzYAQHoAQH4AQKIAgGoAgO4AvTIm_IFwAIB&sid=eae1a774e77c394c5e69703d37e033a3&sb=1&src=searchresults&src_elem=sb&error_url=https://www.booking.com/searchresults.html?label=gen173nr-1FCAEoggI46AdIM1gEaGyIAQGYATG4AQfIAQzYAQHoAQH4AQKIAgGoAgO4AvTIm_IFwAIB;sid=eae1a774e77c394c5e69703d37e033a3;tmpl=searchresults;city=-2109472;class_interval=1;dest_id=-2109472;dest_type=city;dr_ps=IDR;dtdisc=0;from_idr=1;ilp=1;inac=0;index_postcard=0;label_click=undef;offset=0;postcard=0;room1=A%2CA;sb_price_type=total;shw_aparth=1;slp_r_match=0;srpvid=7df1609ef03a0103;ss_all=0;ssb=empty;sshis=0;top_ufis=1&;&sr_autoscroll=1&ss=Rishīkesh&is_ski_area=0&ssne=Rishīkesh&ssne_untouched=Rishīkesh&city=-2109472&checkin_year=2020&checkin_month=3&checkin_monthday=4&checkout_year=2020&checkout_month=3&checkout_monthday=5&group_adults=2&group_children=0&no_rooms=1&from_sf=1'


response=requests.get(url,headers=headers)
soup=BeautifulSoup(response.content,'lxml')

for item in soup.select('.sr_property_block'):
    try:
        print('----------------------------------------')
        # Print the raw HTML of this card
        print(item)
        print('----------------------------------------')
    except Exception as e:
        # Skip any card that cannot be printed
        print('')

When you run it with

python3 scrapeBooking.py

the output shows the HTML of each card in isolation.
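To see what the loop iterates over without hitting the network, the same selector can be tried on a small hand-written snippet. The HTML below is made up for illustration and only mimics the sr_property_block class; the built-in html.parser is used here so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two listing cards and one unrelated element.
SAMPLE_HTML = """
<div class="sr_property_block">Hotel A</div>
<div class="sr_property_block">Hotel B</div>
<div class="unrelated">Not a listing</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# select() takes a CSS selector and returns a list of matching Tag objects.
cards = soup.select('.sr_property_block')

for card in cards:
    # Each card is its own Tag, so it can be printed or queried separately.
    print(card)
```

Only the two elements carrying the class are returned; the unrelated element is ignored.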

On closer inspection, you’ll notice that the hotel’s name is always held in an element with the class sr-hotel__name. While we’re at it, let’s also obtain the review count, price, and rating.

for item in soup.select('.sr_property_block'):
    try:
        print('----------------------------------------')
        # Hotel name and link to its page
        print(item.select('.sr-hotel__name')[0].get_text().strip())
        print(item.select('.hotel_name_link')[0]['href'])
        # Review details: score badge, text, and title
        print(item.select('.bui-review-score__badge')[0].get_text().strip())
        print(item.select('.bui-review-score__text')[0].get_text().strip())
        print(item.select('.bui-review-score__title')[0].get_text().strip())
        # High-resolution image URL and displayed price
        print(item.select('.hotel_image')[0]['data-highres'])
        print(item.select('.bui-price-display__value')[0].get_text().strip())
    except Exception as e:
        # A card that lacks any of these elements is skipped
        print('')

We also attempted to obtain the hotel image and link, as well as other critical pieces of information.
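One weakness of indexing with [0] is that a single missing element (say, a listing that shows no review score) raises an IndexError, and the whole card is then skipped. A possible alternative, sketched here, uses select_one, which returns None when nothing matches. The helper names text_or_none and parse_card, the sample markup, and the base URL used to resolve relative links are illustrative assumptions.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'https://www.booking.com'  # assumed base for relative links


def text_or_none(item, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = item.select_one(selector)
    return node.get_text().strip() if node is not None else None


def parse_card(item):
    """Collect the fields from one card into a dict, tolerating gaps."""
    link = item.select_one('.hotel_name_link')
    href = link['href'].strip() if link is not None and link.has_attr('href') else None
    return {
        'name': text_or_none(item, '.sr-hotel__name'),
        # urljoin leaves absolute links alone and resolves relative ones.
        'url': urljoin(BASE_URL, href) if href else None,
        'score': text_or_none(item, '.bui-review-score__badge'),
        'price': text_or_none(item, '.bui-price-display__value'),
    }


# Hypothetical card with no review score, to show the None fallback.
SAMPLE = """
<div class="sr_property_block">
  <a class="hotel_name_link" href="/hotel/in/example.html">
    <span class="sr-hotel__name"> Example Inn </span>
  </a>
  <div class="bui-price-display__value"> 1,200 </div>
</div>
"""

card = BeautifulSoup(SAMPLE, 'html.parser').select_one('.sr_property_block')
record = parse_card(card)
print(record)
```

With this approach a card with a missing field still yields a record, and the gap shows up as None rather than as a dropped listing.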

This is how the entire code appears.

# headers, url, and response are set up exactly as in the earlier snippet
soup=BeautifulSoup(response.content,'lxml')

for item in soup.select('.sr_property_block'):
    try:
        print('----------------------------------------')
        print(item.select('.sr-hotel__name')[0].get_text().strip())
        print(item.select('.hotel_name_link')[0]['href'])
        print(item.select('.bui-review-score__badge')[0].get_text().strip())
        print(item.select('.bui-review-score__text')[0].get_text().strip())
        print(item.select('.bui-review-score__title')[0].get_text().strip())
        print(item.select('.hotel_image')[0]['data-highres'])
        print(item.select('.bui-price-display__value')[0].get_text().strip())
        print('----------------------------------------')
    except Exception as e:
        # A card that lacks any of these elements is skipped
        print('')

When you execute the code, it prints all the information we require for each listing.
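If the goal is rate monitoring or an aggregator, printing is usually not enough and the rows need to be stored. A minimal sketch using the standard csv module follows. The rows list holds placeholder values rather than real scrape output, and the column names and the rows_to_csv helper are assumptions.

```python
import csv
import io

FIELDS = ['name', 'score', 'price']

# Placeholder rows standing in for values extracted from each card.
rows = [
    {'name': 'Example Inn', 'score': '8.1', 'price': '1,200'},
    {'name': 'Sample Lodge', 'score': None, 'price': '950'},
]


def rows_to_csv(rows, fields=FIELDS):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

The csv module quotes values that contain commas, such as a formatted price, and writes None as an empty field.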

Overcoming IP Blocks

Investing in a private rotating proxy service such as Proxies API can often mean the difference between a successful, pain-free web scraping project that consistently gets the job done and one that never does.

Plus, with the current offer of 1000 free API requests, there’s nothing to lose by trying our rotating proxy and comparing notes. Integration takes a single line, so the change to your code is almost unnoticeable.

Our rotating proxy server, Proxies API, is a simple API that addresses IP blocking issues.

It draws on millions of high-speed rotating proxies located around the globe.

With our IP rotation service, your IP address is changed automatically.

Hundreds of our customers have solved the headache of IP blocks through this simple API, helped by our automatic User-Agent string rotation (which simulates requests from different, valid web browsers and browser versions) and our automatic CAPTCHA solving technology.

The entire system can be accessed from any programming language through a simple API call like the one below.

curl "http://api.iwebscraping.com/?key=API_KEY&url=https://example.com"
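For readers working in Python, the same call might look like the sketch below. It assumes only the endpoint and the key and url parameters shown in the curl command; the helper names are hypothetical, and API_KEY is a placeholder. The target URL is percent-encoded so that its own query string is not confused with the API’s parameters.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = 'http://api.iwebscraping.com/'


def build_api_url(api_key, target_url):
    """Build the request URL, percent-encoding the target page address."""
    return API_ENDPOINT + '?' + urlencode({'key': api_key, 'url': target_url})


def fetch_via_api(api_key, target_url, timeout=30):
    """Fetch target_url through the API and return the raw response bytes."""
    with urlopen(build_api_url(api_key, target_url), timeout=timeout) as resp:
        return resp.read()


# Building the URL needs no network access.
print(build_api_url('API_KEY', 'https://example.com/?page=2'))
```

The bytes returned by fetch_via_api could then be handed to BeautifulSoup in place of response.content.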

For more details, contact iWeb Scraping today!

