How to get a GitHub profile image using web scraping in Python?

Vipul kunwar
·Dec 17, 2021·

Table of contents

  • Requests library
  • BeautifulSoup library to extract the data from HTML page
  • What's the problem?
  • Wrap all together

Did you get what I'm trying to say?

No...

Then let me put it this way: how would you get a GitHub profile image using web scraping in Python?
Wait a minute, do you know what web scraping is?
Oh! Sorry, I forgot to tell you.

According to Wikipedia, web scraping is a technique for extracting data from a website using scraping tools.

First of all, I learnt this project from freeCodeCamp.
You should check out their YouTube channel.
Second, I'm going to use two Python web scraping libraries: requests and BeautifulSoup.

If you don't understand what these libraries do, you can't move forward.
So, let's look at the basics of these libraries.

Requests library

I'm starting with the requests library.
The requests library sends an HTTP request to a web page.
It's like the doorbell you ring at someone's house.

In web scraping, before extracting data from an HTML page, you need to make a request to that page.
You can extract the data once the request succeeds, which the server signals with the HTTP status code 200 (OK).
It's like ringing the doorbell before entering someone's house.
If the door opens, you can enter the house.

Oh, you're wondering what happens if we don't make the request?
Well, that's like entering someone's house without permission.
You'd have to fight the house owner, or the house owner could call the police.

Anyway, you're not going to do that.
Now you understand what the requests library does.
If you want to know the how, when and why of the requests library, check out here.

Program of the requests library

Here I'm going to install the requests library first, then import it.
You can install requests using pip install requests.

After installing it, you have to import it.
For this, use import requests.

So far, we've set up the requests library.
Using the requests library, we can make a request to an HTML page.

If the request succeeds with the HTTP status code 200, you can use BeautifulSoup to extract data from the HTML page.
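To make the doorbell idea concrete, here is a small sketch of checking the status code before doing any scraping. The helper name page_is_available is my own choice for this illustration and is not part of the original project.

```python
import requests


def page_is_available(response):
    """Return True when the server answered with HTTP status 200 (OK)."""
    # requests.codes.ok is the named constant for the number 200.
    return response.status_code == requests.codes.ok


# Example usage (needs an internet connection):
# response = requests.get("https://github.com/", timeout=10)
# print(response.status_code, page_is_available(response))
```

Any other status code, for example 404 for a page that does not exist, means the door stayed closed and there is nothing useful to extract.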

BeautifulSoup library to extract the data from HTML page

It's a Python tool for extracting data. It parses HTML and XML files into a parse tree and lets you pull data out of that tree.
The current major version is Beautiful Soup 4, which you import as bs4.
You can install it using pip install beautifulsoup4.
After this, I'm going to import BeautifulSoup from bs4.

Import BeautifulSoup

In this section, I'm going to import the BeautifulSoup class.
You can import it using the line of code below.

For example,
from bs4 import BeautifulSoup as bs

In the code above, you can see BeautifulSoup as bs. Writing BeautifulSoup every time gets long.
To keep it short, I've used bs instead of BeautifulSoup.
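Before touching GitHub, it may help to see bs at work on a tiny page. The sketch below parses a hard-coded HTML string (made up for this example, with example.com as a placeholder link) and reads a piece of text and an attribute from it.

```python
from bs4 import BeautifulSoup as bs

# A made-up page, used only for this illustration.
html = (
    '<html><body>'
    '<h1 class="title">Hello</h1>'
    '<a href="https://example.com">link</a>'
    '</body></html>'
)

# Build the parse tree with Python's built-in HTML parser.
soup = bs(html, "html.parser")

# find() returns the first matching tag.
heading_text = soup.find("h1").get_text()   # text inside the <h1> tag
link_target = soup.find("a")["href"]        # value of the href attribute

print(heading_text)
print(link_target)
```

Reading an attribute with square brackets, as in ["href"], is the same trick used later to read the src of the profile image.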

If you want to know more about BeautifulSoup, check out here.

That was a brief description of the two libraries I'm going to use.
In case you forget, here is a recap.

  • requests library: sends the HTTP request to the HTML page
  • BeautifulSoup: extracts the data from the page's HTML once the request has succeeded

So, you've understood these libraries. Now I'm going to write code with them.
Before writing the code, let's look at the problem.

What's the problem?

In this section, I'm going to describe the problem.
After that, I'm going to propose a solution.
You can propose a different solution in the comment section.

So, let's come back to the problem.
We want to get a GitHub profile image in our program. To get it, we first need user input.

The user enters a GitHub user name. The program then extracts that user's profile image from GitHub.

Here are the things we have to do:

  • Make a request to the user's account page on GitHub
  • Extract the GitHub profile image from that page

Step-by-step code to get the GitHub profile image with Python

First of all, you have to import both libraries. Earlier, I imported them one at a time.
Let's put them together. When you import both libraries, your code looks like this:

import requests
from bs4 import BeautifulSoup as bs

That's how I import both libraries.

Next, I have to take input from the user. This is done with the input() function. Your program looks like this:

import requests
from bs4 import BeautifulSoup as bs
# take the github profile from user. Here you'll put the user name
github_user = input('Input github user: ')

The input() function takes the input from the user.

Next, I'm going to concatenate the user name with the GitHub website address.
I'm taking this step because we want to extract data from GitHub.
The code is url = "https://github.com/"+github_user.

Here, we start from GitHub's website address and append the particular GitHub user's name.
This creates the specific URL that I want to request.
When I combine this code with the main program, it looks like this:

import requests
from bs4 import BeautifulSoup as bs
# take the github profile from user. Here you'll put the user name
github_user = input('Input github user: ')

# Create a specified url to make the request
url = "https://github.com/"+github_user
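Plain concatenation works as long as the user types a clean name. As an optional sketch, the URL can be built a bit more defensively: strip stray spaces, reject empty input, and percent-encode unusual characters. The function name build_profile_url and the constant GITHUB_BASE_URL are names I made up for this illustration.

```python
from urllib.parse import quote

# Base address of GitHub, same string as in the tutorial code.
GITHUB_BASE_URL = "https://github.com/"


def build_profile_url(username):
    """Return the profile page URL for a GitHub user name."""
    cleaned = username.strip()  # drop accidental spaces around the name
    if not cleaned:
        raise ValueError("GitHub user name must not be empty")
    # quote() percent-encodes characters that are not safe inside a URL path.
    return GITHUB_BASE_URL + quote(cleaned, safe="")


print(build_profile_url("  vipulkunwar000 "))
```

For an ordinary user name the result is identical to the concatenated string, so the rest of the program does not change.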

I've created the URL that I want to request.
After this, I have to make a request and extract the GitHub profile image.
So, here I'm going to make the request to GitHub using the get() function.

If you aren't familiar with get(), in brief: requests.get() sends an HTTP GET request to the specified URL.
Let's see how it looks:
r = requests.get(url)

When the line above is added to the program:

import requests
from bs4 import BeautifulSoup as bs
# take the github profile from user. Here you'll put the user name
github_user = input('Input github user: ')

# Create a specified url to make the request
url = "https://github.com/"+github_user

# make the request to the specific url using get() method
r = requests.get(url)
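The request can fail, for example when the user name does not exist or the network is down. Here is a hedged sketch of one way to handle that. The function name fetch_profile_html and the 10-second timeout are my own assumptions, not part of the original project.

```python
import requests


def fetch_profile_html(url, timeout=10.0):
    """Return the page body as bytes, or None when the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        # Turn 4xx/5xx answers (such as 404 for an unknown user) into exceptions.
        response.raise_for_status()
    except requests.exceptions.RequestException as error:
        # RequestException is the base class of the errors requests raises.
        print(f"Request failed: {error}")
        return None
    return response.content


# Example usage (needs an internet connection):
# html = fetch_profile_html("https://github.com/vipulkunwar000")
```

Returning None keeps the example short; the calling code then has to check for None before parsing.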

Next, you have to parse the content of the page. To do this, use the line below:
soup = bs(r.content,"html.parser")

Here bs parses the page and returns an object you can search.

  • r.content: the raw bytes of the response body for that page
  • "html.parser": the name of Python's built-in HTML parser, which breaks the HTML page down into its components

When this parsing line is added to the program, we get:

import requests
from bs4 import BeautifulSoup as bs
# take the github profile from user. Here you'll put the user name
github_user = input('Input github user: ')

# Create a specified url to make the request
url = "https://github.com/"+github_user

# make the request to the specific url using get() method
r = requests.get(url)

# Get the data associated to that page
soup = bs(r.content,"html.parser")
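To see what the parsing step produces without a network call, here is a small sketch that feeds bs a bytes value, the same type that r.content has. The page content is a made-up stand-in for a real response.

```python
from bs4 import BeautifulSoup as bs

# Hypothetical stand-in for r.content: the raw bytes of a tiny page.
raw_bytes = (
    b"<html><head><title>Demo profile</title></head>"
    b"<body><p>Hi</p></body></html>"
)

# bs accepts bytes as well as str and builds the parse tree from them.
soup = bs(raw_bytes, "html.parser")

page_title = soup.title.string               # text of the <title> tag
first_paragraph = soup.find("p").get_text()  # text of the first <p> tag

print(page_title)
print(first_paragraph)
```

Once the soup object exists, every tag of the page can be reached through it, which is what the next step relies on.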

Before adding the next line of code, I'm going to show you the browser's Inspect tool.
So, see the image below.

image.png

The image above shows the user's page on GitHub.
We want the profile image.
You can see that there are many images on the page, such as the GitHub logo. To avoid confusion, I'll use the Inspect tool.

To do this, first click on the profile image, then open the Inspect tool.
For example,

image.png

When you click on Inspect, you'll see the source code.
For example,

image.png

The line of code for the profile image is

<img style="height:auto;" alt="" class="avatar avatar-user width-full border color-bg-default" src="https://avatars.githubusercontent.com/u/95694307?v=4" width="260" height="260">

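In the code above, you can observe that:

  • The image's class includes avatar, so it is the avatar image
  • Its src attribute holds the link to the image

So, our next line of code:
profile_image = soup.find('img',{'alt':'Avatar'})['src']

Here the find() method looks for the first img tag whose alt attribute is Avatar and reads its src attribute.
Note that the tag shown above has an empty alt attribute. If find() returns None for the page you are scraping, match on the class instead, for example soup.find('img', class_='avatar-user').
When this line of code is added to the main program: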

import requests
from bs4 import BeautifulSoup as bs
# take the github profile from user. Here you'll put the user name
github_user = input('Input github user: ')

# Create a specified url to make the request
url = "https://github.com/"+github_user

# make the request to the specific url using get() method
r = requests.get(url)

# Get the data associated to that page
soup = bs(r.content,"html.parser")

# find the avatar image
profile_image = soup.find('img',{'alt':'Avatar'})['src']
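The lookup in the last line raises a TypeError when no matching img tag exists, because find() then returns None. Below is a sketch of a more forgiving lookup, run against a static snippet instead of the live page. The function name find_avatar_url and the logo tag in SAMPLE_HTML are made up for this illustration; the avatar tag is the one shown in the inspection step.

```python
from bs4 import BeautifulSoup as bs

# Static sample: a hypothetical logo image plus the avatar tag from the tutorial.
SAMPLE_HTML = """
<html><body>
<img alt="GitHub logo" class="logo" src="https://example.com/logo.png">
<img style="height:auto;" alt=""
     class="avatar avatar-user width-full border color-bg-default"
     src="https://avatars.githubusercontent.com/u/95694307?v=4"
     width="260" height="260">
</body></html>
"""


def find_avatar_url(html):
    """Return the avatar image link, or None when no avatar tag is found."""
    soup = bs(html, "html.parser")
    # First try the selector used in the tutorial.
    tag = soup.find("img", {"alt": "Avatar"})
    if tag is None:
        # Fall back to the CSS class seen on the inspected tag.
        tag = soup.find("img", class_="avatar-user")
    if tag is None:
        return None
    return tag.get("src")


print(find_avatar_url(SAMPLE_HTML))
```

Which selector matches depends on the markup GitHub serves at the time, so it is worth checking the page with the Inspect tool before relying on either one.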

The program now finds the link to the profile image. The last thing I have to do is display it using the print() function.

Thus, our final code is

import requests
from bs4 import BeautifulSoup as bs
# take the github profile from user. Here you'll put the user name
github_user = input('Input github user: ')

# Create a specified url to make the request
url = "https://github.com/"+github_user

# make the request to the specific url using get() method
r = requests.get(url)

# Get the data associated to that page
soup = bs(r.content,"html.parser")

# find the avatar image
profile_image = soup.find('img',{'alt':'Avatar'})['src']

# display the profile image
print(profile_image)

When you execute the code above, it prompts:

Input github user:

Next, I'm going to enter the user name, i.e. vipulkunwar000

Input github user: vipulkunwar000

When you press Enter after vipulkunwar000, you'll get a link, something like:

Input github user: vipulkunwar000
https://avatars.githubusercontent.com/u/95694307?v=4

So, you've seen the link. When you click on the link, you get the GitHub profile image:

image.png

Thus, you can enter a GitHub user name and get the link to the GitHub profile image.
When you click on the link, you get the image.
Those are the steps to get a user's GitHub profile image.

So, scrape your friend's GitHub profile image and give them an "oh..." moment.
Also, don't forget to check out freeCodeCamp.
They have tons of Python projects.

In the last part, I'm going to wrap all of this up:

Wrap all together

Hey, you got your friend's GitHub profile image.

In this tutorial, you've seen two of the most valuable Python libraries for web scraping: requests and BeautifulSoup.

The requests library is used when you need to make a request to an HTML page.
It works like the doorbell of a house. Before using it, you need to install it with pip install requests.

Next, you've seen BeautifulSoup, which extracts data from the HTML page.
Its current major version is Beautiful Soup 4, imported as bs4.
To find the profile image, you use the find() method of bs4.

Finally, the program takes the user name using the input() function.
It prompts with Input github user:
When you enter a user name, you get the link to the profile image, such as:

Input github user: vipulkunwar000
https://avatars.githubusercontent.com/u/95694307?v=4

You've got the profile image. What are you going to do with it?
I'm going to astonish my friends.
