Download Files Using the curl Command: A Comprehensive Guide

This tutorial dives deep into downloading files with the curl command.

We’ll walk through the basics of HTTP downloads, secure HTTPS transfers, work with FTP servers, and navigate advanced features like segmented downloads and speed limits.

 

 

Download to Standard Output

If you run curl without any flags, it sends the downloaded content straight to standard output.

curl http://example.com/sample.txt

This command will display the contents of sample.txt directly in your terminal.

You can utilize the power of the Unix pipeline to process the data immediately. For instance, to search for a specific term in a file without saving it to disk:

curl http://example.com/sample.txt | grep "search_term"
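To try the pipeline without a live server, you can point curl at a local file via its file:// protocol support; the path below is just a scratch file created for the demo:

```shell
# Create a scratch file to stand in for the remote resource.
printf 'alpha\nsearch_term here\nbeta\n' > /tmp/sample.txt

# Stream it through curl and filter with grep, exactly as with an HTTP URL.
# -s hides the progress meter so only the matching lines appear.
curl -s file:///tmp/sample.txt | grep "search_term"
```

The same pattern works with anything on the right side of the pipe, such as wc -l to count matching lines.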

 

Download & Save to a File

When you want to download files from websites that use the HTTP protocol, the curl command makes it straightforward. The simple structure below allows you to fetch files directly to your computer.

curl -O http://example.com/sample.zip

This command fetches the file sample.zip from http://example.com and saves it in the current directory. The -O flag instructs curl to save the file with its original name.

 

Verify/Bypass SSL certificates

You can verify that the server’s certificate is valid, or choose to bypass verification when you’re certain about the source.

Using --cacert

curl --cacert /path/to/cacert.pem -O https://secure-example.com/file.zip

This command fetches the file from a secure HTTPS site, while ensuring the server’s certificate is verified against the given certificate file /path/to/cacert.pem.

Using --insecure

curl --insecure -O https://secure-example.com/file.zip

By using the --insecure flag, curl bypasses the certificate verification process.

Although this can be useful in specific situations such as internal networks, it’s potentially risky and should be used with caution.

 

Download from FTP Servers

To download a file from an FTP server, use:

curl -O ftp://ftp.example.com/path/to/file.zip

If the FTP server requires a username and password, provide them in the URL or use the -u flag:

curl -O ftp://username:password@ftp.example.com/path/to/file.zip

Or:

curl -u username:password -O ftp://ftp.example.com/path/to/file.zip

FTP operates in two modes: Active and Passive. By default, curl uses Passive mode. If you need to enforce Active mode, use the --ftp-port option:

curl --ftp-port - -O ftp://ftp.example.com/path/to/file.zip

In this case, the - after --ftp-port tells curl to use the same IP address as the control connection and let the operating system pick a free local port.

 

Determine File Size

Determining the size of a file before downloading can be essential for various reasons: to ensure sufficient disk space, to perform segmented downloads, or simply to get an idea of download duration.

You can use the -I flag to send a HEAD request, fetching only the headers without downloading the file body.

curl -I http://example.com/largefile.zip | grep Content-Length

In the response, the Content-Length header indicates the size of the file in bytes.
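In scripts it’s handy to capture that value in a variable. A minimal sketch, using a hard-coded sample response since the URLs in this guide are placeholders; awk matches the header name case-insensitively and tr strips the carriage return real HTTP headers end with:

```shell
# Sample response headers (what `curl -I <url>` would print).
headers='HTTP/1.1 200 OK
Content-Type: application/zip
Content-Length: 1048576'

# Extract the size in bytes.
size=$(printf '%s\n' "$headers" \
  | awk -F': ' 'tolower($1)=="content-length" {print $2}' \
  | tr -d '\r')
echo "File size: $size bytes"   # → File size: 1048576 bytes
```

Against a real server you would replace the hard-coded string with `headers=$(curl -sI http://example.com/largefile.zip)`.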

 

Ranges and Partial Downloads

You can use -r or --range to download a specific part of a file:

curl -O http://example.com/sample.mp3 -r 10000-50000

The -r flag allows you to specify a byte range, ensuring only that portion of the file gets downloaded.

In this example, bytes 10,000 to 50,000 of the sample.mp3 file are fetched.

 

Segmented Downloading

Segmented downloading, also known as multi-connection or multi-part downloading, involves splitting a file into several parts and downloading those parts separately.

While curl doesn’t natively support segmented downloads, you can still achieve the same effect manually with a bit of effort.

First, you’ll want to know the total size of the file in bytes. Use curl with the -I flag to fetch only the headers as we did above:

curl -I http://example.com/largefile.zip | grep Content-Length

Then divide the file into segments:

Suppose the file size is 3000 bytes, and you want to download it in 3 segments. Your segments would be:

  • 0-999 bytes
  • 1000-1999 bytes
  • 2000-2999 bytes
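The range arithmetic above can be scripted for any size and segment count; this sketch lets the last segment absorb any remainder when the size doesn’t divide evenly:

```shell
size=3000      # total file size in bytes (from Content-Length)
segments=3     # number of parts to download

chunk=$(( size / segments ))
for i in $(seq 0 $(( segments - 1 ))); do
    start=$(( i * chunk ))
    if [ "$i" -eq $(( segments - 1 )) ]; then
        end=$(( size - 1 ))            # last segment takes the remainder
    else
        end=$(( start + chunk - 1 ))
    fi
    echo "segment$(( i + 1 )): ${start}-${end}"
done
# → segment1: 0-999
# → segment2: 1000-1999
# → segment3: 2000-2999
```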

Next, download each segment, using -r or --range:

curl -r 0-999 -o segment1 http://example.com/largefile.zip
curl -r 1000-1999 -o segment2 http://example.com/largefile.zip
curl -r 2000-2999 -o segment3 http://example.com/largefile.zip

After downloading, you can combine the segments to reconstruct the original file:

cat segment1 segment2 segment3 > combinedfile.zip
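You can rehearse the whole split-and-reassemble round trip locally, with no server involved, by carving the same ranges out of a test file with dd and comparing the result; all filenames here are scratch names:

```shell
# Build a 3000-byte test file standing in for largefile.zip.
head -c 3000 /dev/urandom > original.bin

# Carve out the same three byte ranges used above.
dd if=original.bin of=segment1 bs=1 skip=0    count=1000 2>/dev/null
dd if=original.bin of=segment2 bs=1 skip=1000 count=1000 2>/dev/null
dd if=original.bin of=segment3 bs=1 skip=2000 count=1000 2>/dev/null

# Reassemble and verify the copy is byte-identical.
cat segment1 segment2 segment3 > combined.bin
cmp original.bin combined.bin && echo "segments reassembled correctly"
```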

 

Download Multiple Files Sequentially

When you have multiple files to download from a server, curl can handle them all in a single command, making the process efficient and organized.

curl -O http://example.com/file1.zip -O http://example.com/file2.zip

By chaining -O flags followed by URLs, you instruct curl to download each of these files to the current directory.

If you have multiple URLs stored in a file, you can download all of them using:

xargs -n 1 curl -O < urls.txt

Here, urls.txt should have one URL per line.
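A minimal, network-free rehearsal of this workflow, using file:// URLs pointing at scratch files as stand-ins for real ones:

```shell
# Two local files stand in for remote downloads.
printf 'one\n' > /tmp/file1.txt
printf 'two\n' > /tmp/file2.txt

# urls.txt: one URL per line, exactly as curl expects.
cat > urls.txt <<'EOF'
file:///tmp/file1.txt
file:///tmp/file2.txt
EOF

# Fetch each URL in turn; -O saves each under its original name
# in the current directory.
xargs -n 1 curl -sO < urls.txt
```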

Downloading files sequentially can be slow, since each file is fetched only after the previous one finishes.

To speed things up, curl can download files in parallel using the --parallel flag.

 

Download Multiple Files Concurrently

You can use the --parallel flag to download multiple URLs simultaneously:

curl --parallel -O http://example.com/file1.zip -O http://example.com/file2.zip

Here, both file1.zip and file2.zip will be downloaded simultaneously.

This can significantly speed up the transfer process, especially when dealing with multiple small files or when the server limits the speed per connection.

By default, curl will attempt to download up to 50 URLs in parallel. You can adjust this to suit your bandwidth or the server’s capabilities with the --parallel-max flag.

Note: Ensure that your version of curl supports parallel transfers; the --parallel option was introduced in version 7.66.0.

 

Setting Maximum Simultaneous Transfers

If you’re dealing with numerous downloads simultaneously, you can set a limit to avoid potential network or resource constraints.

curl --parallel --parallel-max 3 -O http://example.com/file1.zip -O http://example.com/file2.zip -O http://example.com/file3.zip

Using the --parallel-max flag followed by a number, you can limit the maximum simultaneous transfers. In this example, curl limits it to 3.

 

Resume Interrupted Downloads

If your download is broken or interrupted for any reason, you don’t have to start over. You can continue right where you left off.

curl -C - -O http://example.com/largefile.zip

The -C - flag informs curl to continue the download from where it left off, in case of interruptions.

This is especially useful for unreliable network connections.

Manually Specifying the Resume Point

If you know the byte offset at which the download was interrupted, you can specify it manually with the -C option:

curl -C 50000 -O http://example.com/largefile.zip

In this example, the download will resume starting from byte 50000.

After finishing an interrupted download, always ensure file integrity. Many websites provide MD5, SHA-1, or SHA-256 hashes alongside their download links.

You can use tools like md5sum, sha1sum, or sha256sum to verify the integrity of your downloaded file against the provided hash.
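A sketch of that verification, using a scratch file and its known SHA-256 in place of a real download and its published hash:

```shell
# Stand-in for a completed download whose page advertises this SHA-256.
printf 'hello\n' > file.zip
expected=5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03

# Recompute the hash locally and compare it to the published value.
actual=$(sha256sum file.zip | awk '{print $1}')
if [ "$actual" = "$expected" ]; then
    echo "checksum OK"
else
    echo "checksum MISMATCH" >&2
fi
```

In real use, `expected` would be copied from the download page rather than hard-coded alongside the file.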

Note: If a server doesn’t support byte ranges or partial requests, using the -C option won’t have any effect, and the download will start from the beginning.

 

Download in the Background

To start a curl download in the background, you can use the & symbol at the end of your command:

curl -O http://example.com/largefile.zip &

This will initiate the download, but immediately return control to the terminal.

To check the status of your background job, use the jobs command.

This will display all current background tasks associated with your terminal session.

If you need to bring the download back to the foreground, perhaps to check its progress in detail or to gracefully stop it, use the fg command.

When running curl in the background, it’s a good practice to redirect its output, especially if you’re expecting error messages.

curl -O http://example.com/largefile.zip > download.log 2>&1 &

Here, both standard output and error messages are redirected to download.log.

If you want to keep downloading and close the terminal, use nohup to ensure the download continues:

nohup curl -O http://example.com/largefile.zip &

The nohup command ensures that the curl operation continues even after the terminal is closed.

 

Setting Download Speed Limits

To avoid hogging network resources, you can limit the download speed.

curl --limit-rate 200K -O http://example.com/largefile.zip

The --limit-rate flag followed by a speed (e.g., 200K for 200 kilobytes per second) sets a speed limit for the download.

The --limit-rate option supports various units for speed:

  • K or k for kilobytes
  • M or m for megabytes
  • G or g for gigabytes
  • A bare number for bytes

curl --limit-rate 1M -O http://example.com/largefile.zip

This restricts the download speed to 1 megabyte per second.

 

Handle Slow/Stalled Download

In scenarios where the download speed drops below a certain threshold for a specified duration, you might want to abort the transfer. This can be useful in detecting stalled connections.

We have two flags that allow us to control this:

--speed-limit <rate>: the transfer speed, in bytes per second, that the transfer must stay above for it to continue.

--speed-time <seconds>: how long the transfer may stay below that rate before curl considers it too slow and aborts.

curl --speed-time 30 --speed-limit 1000 -O http://example.com/largefile.zip

In this example, if the download rate goes below 1000 bytes/sec for 30 seconds, curl will stop the download.

 

Retry on Error

Network issues or server errors can sometimes disrupt downloads. Instead of manually restarting the download, curl can be set to retry automatically.

curl --retry 5 --retry-max-time 120 --retry-delay 10 --retry-all-errors -O http://example.com/file.zip

Here:

  • --retry 5 instructs curl to retry up to 5 times.
  • --retry-max-time 120 sets the total time, in seconds, that curl should spend on retries.
  • --retry-delay 10 sets a delay of 10 seconds between retries.
  • --retry-all-errors makes curl retry on all errors, not just transient ones.

 

Handling VPN disconnects during downloads

When using VPN connections, temporary disconnects can disrupt your downloads. While curl doesn’t directly handle VPN reconnects, using the retry options can help in such scenarios.

curl --retry 5 --retry-max-time 120 -O http://example.com/file.zip

The retry flags help curl withstand minor connectivity issues, giving your VPN a chance to reconnect so the operation can resume.

 

Downloading Files from Password-Protected URLs

To avoid exposing passwords in the command line history or scripts, you can use curl in a way that it prompts you for the password:

curl -u username -O http://example.com/protectedfile.zip

Because -u is given a username without a password, curl will ask you to enter the password interactively.

If you regularly access password-protected resources, it might be cumbersome to enter credentials repeatedly. The .netrc file is a way to store these. Create a .netrc file in your home directory:

touch ~/.netrc

Add credentials in the following format:

machine example.com
login username
password yourpassword

Then, use curl without specifying the -u option:

curl --netrc -O http://example.com/protectedfile.zip

Note: Always ensure your .netrc file permissions are set to 600 (chmod 600 ~/.netrc) to prevent unauthorized access.

Note: Always ensure you’re transmitting sensitive credentials over secure channels, preferably HTTPS.
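The setup above can be scripted in one step. This sketch writes to a local demo.netrc rather than ~/.netrc so it won’t clobber an existing file; curl’s --netrc-file option lets you point at it explicitly:

```shell
# Create the credentials file (placeholder values) with owner-only access.
umask 177                     # newly created files get mode 600
cat > demo.netrc <<'EOF'
machine example.com
login username
password yourpassword
EOF

ls -l demo.netrc              # should show -rw-------
```

You would then run `curl --netrc-file demo.netrc -O http://example.com/protectedfile.zip` (or use --netrc with ~/.netrc).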

 

Using Proxies

To use a proxy when downloading a file with curl, you can use the -x or --proxy option:

curl -x http://proxyserver:port -O http://example.com/file.zip

If the proxy requires authentication, you can provide credentials using the -U or --proxy-user option:

curl -x http://proxyserver:port -U username:password -O http://example.com/file.zip

curl supports multiple proxy protocols like HTTP, HTTPS, SOCKS4, and SOCKS5. To specify a proxy protocol, include it in the proxy string:

curl -x socks5://proxyserver:port -O http://example.com/file.zip

If you’re frequently using the same proxy, you can set environment variables. For an HTTP proxy, you can set:

export http_proxy=http://proxyserver:port

curl will automatically use this proxy for all subsequent requests. Remember to also set https_proxy.
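For example (host and port are placeholders), you can set the common proxy variables for the current session, including no_proxy for hosts that should bypass the proxy:

```shell
# Placeholder proxy settings for the current shell session.
export http_proxy=http://proxyserver:3128
export https_proxy=http://proxyserver:3128
# Comma-separated hosts and domains that should bypass the proxy.
export no_proxy=localhost,127.0.0.1,.internal.example.com

# Confirm what curl will pick up.
env | grep -i _proxy
```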

 

Displaying the download progress bar

Visual feedback during downloads can be helpful to gauge progress. Curl can provide a progress bar for this purpose.

curl -# -O http://example.com/largefile.zip

The -# flag instructs curl to display a progress bar instead of the default statistics.

Use the -s or --silent option to mute all progress indicators:

curl -s -O http://example.com/largefile.zip

You can combine the silent mode with the verbose mode, in case you want detailed information about the download but not the progress bar:

curl -s -v -O http://example.com/largefile.zip

Note: Always be cautious while using verbose mode, especially with password-protected URLs.

Verbose mode displays all sent headers, which can expose sensitive data such as credentials in your console.

 

Download Files based on Modification Time

When synchronizing files, it’s useful to only fetch files that have changed since a certain date.

curl -O http://example.com/file.zip --time-cond "2023-08-22"

The --time-cond flag followed by a date ensures that the file is only downloaded if it has been modified after the specified date.

For automation, compare the timestamp of a local file with the remote file:

curl --time-cond localfile.zip -O http://example.com/file.zip

In this case, curl checks the last modification time of “localfile.zip” and only downloads the remote file if it’s newer.

To download files that are older than the specified date, prefix the date with a -:

curl -O http://example.com/file.zip --time-cond "-2023-08-22"

This command fetches the file only if it was modified before the given date.

 

Adjusting Connection Timeouts

For better control over connections, you can set how long curl should wait before giving up on establishing a connection.

curl --connect-timeout 10 -O http://example.com/file.zip

The --connect-timeout flag sets the maximum time, in seconds, curl should spend trying to connect.

 

Setting Maximum Time Limit

While --connect-timeout applies only to the connection phase, --max-time limits the total time for the entire operation.

To prevent a download from taking too long, you can set an upper time limit.

curl --max-time 300 -O http://example.com/largefile.zip

Here, curl will abort if the whole operation (connection + download) takes longer than 300 seconds.

 

Stop Curl Sanitization

By default, curl will sanitize the URL. For instance, if you give curl the URL “http://example.com/../test/file.txt”, it will strip out the “..” sequence and request “http://example.com/test/file.txt” from the server.

You can use --path-as-is if you want curl to request a URL without cleaning or modifying it.

curl --path-as-is -O http://example.com//directory/../file.zip

The --path-as-is flag ensures curl doesn’t sanitize or alter the provided path.

Note: avoiding URL sanitization can expose potential security issues, especially if you’re interacting with untrusted servers or sites.

This is useful if you’re a developer or a security researcher and you want to test how your server or application responds to unsanitized URLs.
