Downloading files from the internet is something that almost every programmer will have to do at some point. Python provides several ways to do just that in its standard library. Probably the most popular way to download a file is over HTTP using the urllib or urllib2 module. Python also comes with ftplib for FTP downloads. Finally, there’s a newer third-party module called requests that’s getting a lot of buzz. We’ll be focusing on the two urllib modules and requests in this article.
Since this is a pretty simple task, we’ll just show a quick and dirty script that downloads the same file with each library and names the result slightly differently. We will download a zipped file from this very blog for our example script. Let’s take a look:
# Python 2 code
import urllib
import urllib2
import requests

url = 'https://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'

print "downloading with urllib"
urllib.urlretrieve(url, "code.zip")

print "downloading with urllib2"
f = urllib2.urlopen(url)
data = f.read()
with open("code2.zip", "wb") as code:
    code.write(data)

print "downloading with requests"
r = requests.get(url)
with open("code3.zip", "wb") as code:
    code.write(r.content)
As you can see, urllib is just a one-liner. Its simplicity makes it very easy to use. On the other hand, the other two libraries are very simple too. For urllib2, you just have to open the URL, read it, and write the data out. In fact, you could shorten that part of the script by one line by doing the following:
f = urllib2.urlopen(url)
with open("code2.zip", "wb") as code:
    code.write(f.read())
Either way, it works quite well. The requests library’s function is called get, which corresponds to the HTTP GET method. It returns a Response object, and you read that object’s content attribute to get the bytes you want to write. We use the with statement because it automatically closes the file and simplifies the code. Note that calling “read()” with no arguments can be dangerous if the file is large, because it loads the entire file into memory at once. It would be better to read it in pieces by passing read a size.
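Here’s a rough sketch of what reading in pieces could look like. It is written for Python 3, and the names copy_in_chunks, download_in_chunks and CHUNK_SIZE are my own inventions for illustration, as is the 64 KiB chunk size:

```python
import urllib.request

CHUNK_SIZE = 64 * 1024  # 64 KiB per read; an arbitrary choice for this sketch


def copy_in_chunks(source, destination, chunk_size=CHUNK_SIZE):
    """Copy one file-like object to another in fixed-size pieces.

    Returns the total number of bytes written.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty bytes object means end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


def download_in_chunks(url, filename, chunk_size=CHUNK_SIZE):
    """Download url to filename without holding the whole body in memory."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        return copy_in_chunks(response, out, chunk_size)
```

Calling download_in_chunks(url, "code2.zip") should produce the same file as before while only ever holding one chunk in memory. With requests, the rough equivalent is to pass stream=True to requests.get and loop over r.iter_content(chunk_size=...).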
Update (June 8, 2012)
As pointed out by one of my readers, the urllib stuff changes considerably if you run it through the 2to3.py so that it’s in Python 3 format. So for completeness, here’s what the code looks like now:
# Python 3 code
import urllib.request

url = 'https://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip'

print("downloading with urllib")
urllib.request.urlretrieve(url, "code.zip")

print("downloading with urllib2")
f = urllib.request.urlopen(url)
data = f.read()
with open("code2.zip", "wb") as code:
    code.write(data)
You’ll notice that urllib2 no longer exists and that urllib.urlretrieve and urllib2.urlopen changed into urllib.request.urlretrieve and urllib.request.urlopen respectively. The rest is the same. I removed the requests portion for brevity.
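If you need one script that runs on both versions, one possible approach is to try the Python 3 import first and fall back to the Python 2 names. This is only a sketch, and the download helper is a name I made up for illustration:

```python
try:
    # Python 3: both functions live in urllib.request
    from urllib.request import urlopen, urlretrieve
except ImportError:
    # Python 2: they were split across urllib2 and urllib
    from urllib2 import urlopen
    from urllib import urlretrieve


def download(url, filename):
    """Fetch url and save it as filename on either Python version."""
    urlretrieve(url, filename)
    return filename
```

After the try/except, the rest of the script can call urlopen and urlretrieve without caring which interpreter it is running on.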
So there you have it! Now you too can start downloading files using Python 2 or 3!
Further Reading
- StackOverFlow: How do I download a file over HTTP using Python?
- Downloading a file over the web recipe
Yeah, I wanted to do that too, but I wasn’t coming up with a good example that just anyone could do. Maybe I’ll talk about that in a future post though.
It would be instructive to run 2to3 over your code. You’ll notice that urllib2 is no more (urllib2.urlopen becomes urllib.request.urlopen) and urllib.urlretrieve becomes urllib.request.urlretrieve.
Pep20:
…
13. There should be one– and preferably only one –obvious way to do it.
…
Once you’ve worked with anything beyond this toy example, you’ll realize requests is the only sane way to do this in python.
Why is there urllib and urllib2? God if I know, but the only thing python beginners should learn is requests.
Yeah, I know there should be only one way. I haven’t used requests enough yet since I only discovered it a couple days ago. Since it’s not in the standard library, I thought I should show other ways to do this sort of thing.
Would someone post code that downloads a file after the user logs into the website and accepts an agreement? I am struggling with that code right now. Thanks so much.
thank you so much!!!
can anyone help me with downloading an exe file and installing through script? I’m relatively new to python.
You should be able to use the concepts outlined in this tutorial to download any type of file. Running the file would require calling Python’s subprocess module. Going through an installation wizard is difficult and would require something like pywinauto.
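As a very rough sketch of the subprocess part (Python 3), something like the following could launch a file once it has been downloaded. The run_downloaded_file name is hypothetical, and any silent-install switches depend entirely on the installer, so none are assumed here:

```python
import subprocess


def run_downloaded_file(path, args=()):
    """Run an already-downloaded executable and return its exit code.

    `args` holds extra command-line arguments. Any silent-install switch
    is specific to the installer in question and is not assumed here.
    """
    completed = subprocess.run([path, *args])
    return completed.returncode
```

An exit code of 0 conventionally means success, but each installer decides what its own codes mean.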
Yeah, I’m trying to use this concept, but I’m facing an issue with the proxy settings on my laptop, which is connected to my company VPN. PyWinAuto should probably help me.
I get error number 10060. I tried everything mentioned on Stack Overflow. Is there anything you could suggest to resolve this error?
I haven’t had to deal much with proxies, so you’ll have to do some digging. I did find this answer that looks promising though: https://stackoverflow.com/questions/5620263/using-an-http-proxy-python. Your IT department might also be blocking people from downloading executables. If so, you will need to contact them to see if they will let you get around those restrictions.
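For what it’s worth, here is a minimal sketch of pointing the Python 3 standard library at a proxy. The proxy address is a placeholder, and build_proxy_opener and download_via_proxy are names I made up, so treat this as a starting point rather than a tested recipe for any particular network:

```python
import urllib.request

# Hypothetical proxy address; substitute the one your network actually uses.
PROXIES = {
    "http": "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8080",
}


def build_proxy_opener(proxies=PROXIES):
    """Return an opener that sends requests through the given proxies."""
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)


def download_via_proxy(url, filename, proxies=PROXIES):
    """Download url to filename through the proxy (reads the whole body)."""
    opener = build_proxy_opener(proxies)
    with opener.open(url) as response, open(filename, "wb") as out:
        out.write(response.read())
```

The requests library accepts a similar dictionary through the proxies argument of requests.get.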
thank you , it helped me clearing the error. 🙂
thank you! i completed my module after some learning ,starting with your code!!
thank you, i completed with my module lot of learning,starting from your blog.
Great tutorial! Thanks!
This is a nice tutorial.
What if the webpage has a clickable download icon that starts the download when clicked? How would you approach that case?