Creating a rotating proxy in AWS using the Java SDK

AWS EC2 instances can be used to create an HTTP proxy server, so that when a client browser using the proxy browses the Internet, the EC2 instance’s public IP address effectively becomes its IP address. This can be useful for anonymity, for example if you’re browsing the Internet from home but want to mask your IP address.

Furthermore, you can even have the IP address of your AWS EC2 instance change by releasing its AWS Elastic IP and attaching a new one, thus “rotating” the public IP of the HTTP proxy. This way you can achieve even more anonymity by using an ever-changing IP address.

This is a guide on how to use an AWS EC2 instance (particularly Linux) to create a rotating HTTP proxy. We’ll achieve this using the AWS Java SDK.

To get started, install tinyproxy on your EC2 instance. SSH into it, and run the following command:

sudo yum -y install tinyproxy --enablerepo=epel

Then edit /etc/tinyproxy/tinyproxy.conf. Note the port, which should be 8888 by default. Make sure the following options are set:

BindSame yes
Allow 0.0.0.0/0
#Listen 192.168.0.1 (make sure this is commented out, meaning line starts with #)
#Bind 192.168.0.1 (make sure this is commented out, meaning line starts with #)

Fire up tinyproxy by running:

sudo service tinyproxy start

You may also want to add the same command (without the sudo) to /etc/rc.local so tinyproxy is started whenever the EC2 instance is restarted. The proper way to tell Linux which services to start at boot is the init system’s own tooling (typically chkconfig tinyproxy on where the service command is used, or systemctl enable tinyproxy on systemd-based systems), but adding the command to /etc/rc.local will certainly do the trick.

Now set your web browser (or your OS-level network settings) to use an HTTP proxy, pointing it at the public IP address of the EC2 instance and the tinyproxy port. If you don’t know the IP already, you can get it from the AWS EC2 web console, or by running the following command in the EC2 instance’s shell:

wget http://ipinfo.io/ip -qO -

You can now go to Google and type in “What is my IP address”. Google will show you, and you’ll notice that it’s not your real IP, but the public IP of the EC2 instance you’re using as a proxy.

Before we move on, let’s set up the security group for the EC2 instance to restrict access. This is necessary so that not everyone on the Internet can use your proxy server. The best way to go about this is to use the AWS EC2 web console. Navigate to the security group of the EC2 instance, and note the “Group Name” of the security group (we’ll use that later). Add a custom inbound TCP rule to allow traffic from your IP address to port 8888 (or whatever you configured the proxy to run on).

Next, attach one or more additional network interfaces to your EC2 instance. These give you interfaces to map Elastic IP addresses to without touching the main network interface, which keeps at least one static IP that you can always use to connect to the instance. The additional network interfaces will rotate their public IPs by having Elastic IPs associated with them and later released. (AWS seems to have an endless pool of Elastic IPs: each time you release an Elastic IP and allocate a new one you get a new random address, which works in our favor.)

To attach an Elastic Network Interface to your EC2 instance, check out this documentation: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html. Also note that the number of network interfaces you can attach depends on the EC2 instance type (for t2.micro, I believe the limit is 2 in total, i.e. the default one plus 1 additional; check the per-instance-type limits in the AWS documentation). Lastly, once you create them, take note of the Elastic Network Interface IDs and their corresponding private IP addresses. We’ll use them in our Java code.

Now, below is a Java code segment that can be used to assign and rotate Elastic IPs to your EC2 instance, which then become the IPs used as proxy. Note at the top of the code there are a number of configuration parameters (static class level variables) that you’ll need to fill out. And of course you’ll need to have the AWS Java SDK in your classpath.

The method associateAll() associates the configured Elastic Network Interfaces with new Elastic IPs. The method releaseAll() releases those Elastic IPs back into the wild, so a subsequent associateAll() will return new IPs. Be aware that releaseAll() releases every Elastic IP it finds in the configured region of your account except PUBLIC_IP_TO_IGNORE, not only the ones this code allocated. associateAll() returns an ArrayList of Strings in <IP>:<port> format, one for each Elastic IP attached to the EC2 instance. These can then be used as the HTTP proxy (tinyproxy will automatically bind itself to the proxy port (8888) on the new addresses, so you can connect to them from your client/browser).

Also note that associateAll() will authorize the public IP of the machine running this code by adding it to the EC2 security group to allow connection to TCP port 8888 (or whatever you configured your HTTP proxy port to be) going into the EC2 instance.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashSet;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.Address;
import com.amazonaws.services.ec2.model.AllocateAddressRequest;
import com.amazonaws.services.ec2.model.AllocateAddressResult;
import com.amazonaws.services.ec2.model.AmazonEC2Exception;
import com.amazonaws.services.ec2.model.AssociateAddressRequest;
import com.amazonaws.services.ec2.model.AssociateAddressResult;
import com.amazonaws.services.ec2.model.AuthorizeSecurityGroupIngressRequest;
import com.amazonaws.services.ec2.model.AuthorizeSecurityGroupIngressResult;
import com.amazonaws.services.ec2.model.DescribeAddressesResult;
import com.amazonaws.services.ec2.model.DomainType;
import com.amazonaws.services.ec2.model.IpPermission;
import com.amazonaws.services.ec2.model.IpRange;
import com.amazonaws.services.ec2.model.ReleaseAddressRequest;
import com.amazonaws.services.ec2.model.ReleaseAddressResult;

public class AWSProxyUtil 
{
	static String SECURITY_GROUP = "security-group-name-of-your-ec2-instance";
	static int DEFAULT_PROXY_PORT_TO_ASSIGN = 8888;
	static String PUBLIC_IP_TO_IGNORE = "1.2.3.4"; 	//This is the IP you want to remain static,
							//so you can connect to your EC2 instance.

	@SuppressWarnings("serial")
	static HashSet<String> NETWORK_ID_PRIVATE_IPs_TO_ASSOCIATE_WITH = new HashSet<String>()
	{{
		//These are the network interface IDs and their private IPs
		//that will be used to attach Elastic IPs to. Format is <ID>:<IP>.
		add("eni-xxxxxxxx:1.2.3.4");
		add("eni-xxxxxxxx:1.2.3.4");
		add("eni-xxxxxxxx:1.2.3.4");
	}};
	
	public static String AWS_ACCESS_KEY_ID = "xxx"; //Your AWS API key info
	public static String AWS_SECRET_KEY_ID = "xxx";

	public static Regions AWS_REGIONS = Regions.US_WEST_2;

	public static void releaseAll() throws Exception
	{
		debugSOP("Releasing elastic IPs");
		
		BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_ACCESS_KEY_ID, AWS_SECRET_KEY_ID);
		final AmazonEC2 ec2 = 
				AmazonEC2ClientBuilder
					.standard()
					.withCredentials(new AWSStaticCredentialsProvider(awsCreds))
					.withRegion(AWS_REGIONS)
					.build(); 

		DescribeAddressesResult response = ec2.describeAddresses();

		for(Address address : response.getAddresses()) 
		{
			if(address.getPublicIp().equals(PUBLIC_IP_TO_IGNORE))
			{
				debugSOP(" * Keeping "+address.getPublicIp());
				continue;
			}
			debugSOP(" * Releasing "+address.getPublicIp());
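			//Note: per the AWS docs, in a non-default VPC an associated address must first be
			//disassociated (DisassociateAddress) before it can be released.
			ReleaseAddressRequest releaseAddressRequest = new ReleaseAddressRequest().withAllocationId(address.getAllocationId());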
			ReleaseAddressResult releaseAddressResult = ec2.releaseAddress(releaseAddressRequest);
			debugSOP("   * Result "+releaseAddressResult.toString());
		}
	}
	
	public static ArrayList<String> associateAll() throws Exception
	{
		ArrayList<String> result = new ArrayList<String>();
		
		debugSOP("Associating elastic IPs");
		
		BasicAWSCredentials awsCreds = new BasicAWSCredentials(AWS_ACCESS_KEY_ID, AWS_SECRET_KEY_ID);
		final AmazonEC2 ec2 = 
				AmazonEC2ClientBuilder
					.standard()
					.withCredentials(new AWSStaticCredentialsProvider(awsCreds))
					.withRegion(AWS_REGIONS)
					.build(); 

		DescribeAddressesResult response = ec2.describeAddresses();

		HashSet<String> alreadyAssociated = new HashSet<String>();
		for(Address address : response.getAddresses()) 
		{
			if(address.getPublicIp().equals(PUBLIC_IP_TO_IGNORE))
			{
				continue;
			}
			debugSOP(" * Already associated - Private IP: "+address.getPrivateIpAddress()+", Public IP: "+address.getPublicIp());
			result.add(address.getPublicIp()+":"+DEFAULT_PROXY_PORT_TO_ASSIGN);
			alreadyAssociated.add(address.getNetworkInterfaceId()+":"+address.getPrivateIpAddress());
		}
		
		for(String networkIdPrivateId : NETWORK_ID_PRIVATE_IPs_TO_ASSOCIATE_WITH)
		{
			if(alreadyAssociated.contains(networkIdPrivateId))
				continue;
			
			String fields[] = networkIdPrivateId.split(":");
			String networkId = fields[0];
			String privateIp = fields[1];

			AllocateAddressRequest allocate_request = new AllocateAddressRequest()
				    .withDomain(DomainType.Vpc);

			AllocateAddressResult allocate_response =
			    ec2.allocateAddress(allocate_request);

			String publicIp = allocate_response.getPublicIp();
			String allocation_id = allocate_response.getAllocationId();

			debugSOP(" * Associating Public IP "+publicIp+" to "+networkIdPrivateId);

			AssociateAddressRequest associate_request =
			    new AssociateAddressRequest()
			    	.withNetworkInterfaceId(networkId)
			    	.withPrivateIpAddress(privateIp)
			        .withAllocationId(allocation_id);
			
			AssociateAddressResult associate_response =
				    ec2.associateAddress(associate_request);
			
			debugSOP("   * Result "+associate_response.toString());
			
			result.add(publicIp+":"+DEFAULT_PROXY_PORT_TO_ASSIGN);
		}
		
		debugSOP("Getting public IP address of this machine");
		URL awsCheckIpURL = new URL("http://checkip.amazonaws.com");
		HttpURLConnection awsCheckIphttpUrlConnection = (HttpURLConnection) awsCheckIpURL.openConnection();
		String thisMachinePublicIp;
		//try-with-resources closes the reader once the IP has been read
		try(BufferedReader awsCheckIpReader = new BufferedReader(new InputStreamReader(awsCheckIphttpUrlConnection.getInputStream())))
		{
			thisMachinePublicIp = awsCheckIpReader.readLine();
		}
		
		debugSOP("Authorizing public IP for this machine "+thisMachinePublicIp+" to security group "+SECURITY_GROUP+" for incoming tcp port "+DEFAULT_PROXY_PORT_TO_ASSIGN);
		IpRange ip_range = new IpRange()
			    .withCidrIp(thisMachinePublicIp+"/32");
		IpPermission ip_perm = new IpPermission()
		    .withIpProtocol("tcp")
		    .withToPort(DEFAULT_PROXY_PORT_TO_ASSIGN)
		    .withFromPort(DEFAULT_PROXY_PORT_TO_ASSIGN)
		    .withIpv4Ranges(ip_range);
		AuthorizeSecurityGroupIngressRequest auth_request = new
		    AuthorizeSecurityGroupIngressRequest()
		        .withGroupName(SECURITY_GROUP)
		        .withIpPermissions(ip_perm);
		try
		{
			AuthorizeSecurityGroupIngressResult auth_response =
			    ec2.authorizeSecurityGroupIngress(auth_request);
			debugSOP(" * Result "+auth_response.toString());
		}
		catch(AmazonEC2Exception e)
		{
			if(e.getMessage().contains("already exists"))
				debugSOP(" * Already associated");
			else
			{
				throw e;
			}
		}
		
		debugSOP("Sleeping for 120 seconds to allow EC2 instance(s) to get up to speed.");
		Thread.sleep(120000);

		return result;
	}

	public static void debugSOP(String str)
	{
		System.out.println("[AWSProxyUtil] "+str);
	}
}
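
As a usage sketch (not part of the class above), each entry that associateAll() returns can be turned into a java.net.Proxy and used for an HTTP request. The class name ProxyClientSketch and its two methods are hypothetical helpers made up for illustration, and the 10-second timeouts are arbitrary:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

// Hypothetical helper for illustration; not part of AWSProxyUtil.
final class ProxyClientSketch
{
	// Turns an "<IP>:<port>" string (the format associateAll() returns) into an HTTP proxy object.
	static Proxy toProxy(String ipAndPort)
	{
		int colon = ipAndPort.lastIndexOf(':');
		if(colon <= 0 || colon == ipAndPort.length() - 1)
			throw new IllegalArgumentException("Expected <IP>:<port>, got: " + ipAndPort);
		String host = ipAndPort.substring(0, colon);
		int port = Integer.parseInt(ipAndPort.substring(colon + 1));
		return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
	}

	// Fetches the first line of the given URL, routing the request through the proxy.
	static String fetchFirstLine(String url, Proxy proxy) throws IOException
	{
		HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection(proxy);
		connection.setConnectTimeout(10000); // arbitrary timeouts, in milliseconds
		connection.setReadTimeout(10000);
		try(BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream())))
		{
			return reader.readLine();
		}
		finally
		{
			connection.disconnect();
		}
	}
}
```

For example, calling ProxyClientSketch.fetchFirstLine("http://checkip.amazonaws.com", ProxyClientSketch.toProxy(entry)) for each entry returned by associateAll() should report the Elastic IP rather than your own address, assuming the security group allows your machine in.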

An important note on cost! If you allocate and release Elastic IPs too many times, AWS starts charging you (I think the first hundred or so remaps per month are free, but after that each one is billed and it can add up!). There is also a cost for leaving an Elastic IP address allocated, so check the current EC2 pricing page before rotating frequently.
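
One way to keep the rotation count in check is to put a small budget in front of the releaseAll()/associateAll() pair. The following is only a sketch: RotationBudget is a hypothetical class, and the limit and window you pass in are your own choice, not values taken from AWS pricing.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

// Hypothetical guard for illustration; not part of AWSProxyUtil.
// Caps how many rotations may happen within a sliding time window.
final class RotationBudget
{
	private final int maxRotations;
	private final long windowMillis;
	private final LongSupplier clock; // supplies "now" in milliseconds
	private final Deque<Long> timestamps = new ArrayDeque<Long>();

	RotationBudget(int maxRotations, long windowMillis, LongSupplier clock)
	{
		this.maxRotations = maxRotations;
		this.windowMillis = windowMillis;
		this.clock = clock;
	}

	// Returns true and records a rotation if the budget allows it; returns false otherwise.
	synchronized boolean tryAcquire()
	{
		long now = clock.getAsLong();
		// Forget rotations that have fallen out of the window.
		while(!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis)
			timestamps.removeFirst();
		if(timestamps.size() >= maxRotations)
			return false;
		timestamps.addLast(now);
		return true;
	}
}
```

In use, you would create it once, e.g. new RotationBudget(limit, windowMillis, System::currentTimeMillis), and only call releaseAll() followed by associateAll() when tryAcquire() returns true.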

Java: Workaround for Arrays.sort() slowness when sorting on File.lastModified()

Let’s say you have a File[] array gotten using File.listFiles() (or any other means). Now you want to sort that array based on the last modified date of the files. You could whip up the following code:

File directory = new File("/SomeDirectory");
File[] filesList = directory.listFiles();
Arrays.sort(filesList, new Comparator<File>() {
    public int compare(File file1, File file2)
    {
    	return Long.valueOf(file1.lastModified()).compareTo(file2.lastModified());
    } 
});

Note: this will sort them in ascending order, that is, with the oldest files first and the most recently modified files last.

So this is all well and good, but let’s say your directory has 5 million files in it. It turns out the code above will be extremely slow on such a large list of files (how slow also depends on the speed of your disk drive). The reason is that File.lastModified() is called on both files every time a comparison is made during the sort. Arrays.sort() performs O(n log(n)) comparisons, so File.lastModified() ends up being called on each individual file many times over rather than once. (The issue with the repeated File.lastModified() calls is that the method does not cache the last modified timestamp; each call goes out to the OS and the disk in real time to get the information.)
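
To put a rough number on it: for n = 5,000,000, n·log2(n) is about 5,000,000 × 22.3, or roughly 110 million comparisons, and each comparison calls File.lastModified() twice. The sketch below makes the same effect visible without touching the disk by counting how often a comparator would perform its "expensive" lookup while sorting random numbers. LookupCountDemo is a hypothetical name, and the counter merely stands in for File.lastModified().

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo for illustration: counts simulated lookups instead of hitting the disk.
final class LookupCountDemo
{
	// Sorts n random values and returns how many simulated lookups the comparator performed.
	static long countLookupsDuringSort(int n, long seed)
	{
		final AtomicLong lookups = new AtomicLong();
		Random random = new Random(seed);
		Long[] values = new Long[n];
		for(int i=0; i<n; i++)
			values[i] = random.nextLong();

		Arrays.sort(values, new Comparator<Long>() {
			public int compare(Long a, Long b)
			{
				// Two lookups per comparison, mirroring file1.lastModified() and file2.lastModified().
				lookups.addAndGet(2);
				return Long.compare(a, b);
			}
		});
		return lookups.get();
	}
}
```

Comparing the returned count against n for, say, 100,000 elements shows how many more lookups the sort performs than the single lookup per element that caching needs.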

The way around this is simple. Cache the File.lastModified() timestamp. Here’s a code snippet on how to go about that:

public class FileLastModifiedWrapper implements Comparable<FileLastModifiedWrapper> 
{
	public final File file;
	public final long lastModified;

	public FileLastModifiedWrapper(File file) 
	{
		this.file = file;
		lastModified = file.lastModified();
	}

	public int compareTo(FileLastModifiedWrapper other) 
	{
		return Long.compare(this.lastModified, other.lastModified);
	}
}

//...somewhere else:

File directory = new File("/SomeDirectory");
File[] filesList = directory.listFiles();
FileLastModifiedWrapper[] wrappedFilesList = new FileLastModifiedWrapper[filesList.length];
for(int i=0; i<filesList.length; i++)
	wrappedFilesList[i] = new FileLastModifiedWrapper(filesList[i]);
Arrays.sort(wrappedFilesList);
for(int i=0; i<filesList.length; i++)
	filesList[i] = wrappedFilesList[i].file;

And voila! This will sort immensely faster. I noted that on around 100k files, it took just a few seconds, whereas the original code took up to two minutes.

As you see, FileLastModifiedWrapper caches the lastModified timestamp locally. Then we instantiate an array of FileLastModifiedWrapper objects with each file in our filesList. We then sort this new array, and use it to rearrange the original array.
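
If you would rather not introduce a wrapper class, the same caching idea can be sketched with a lookup table: read each timestamp once into a map, then sort the original array with a comparator that only consults the map. This is an alternative under the same assumptions, not the approach measured above; CachedLastModifiedSort and sortByLastModified are hypothetical names.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration: caches timestamps in a map instead of a wrapper class.
final class CachedLastModifiedSort
{
	// Sorts the array in place, oldest first, calling lastModified() once per file.
	static void sortByLastModified(File[] files)
	{
		final Map<File, Long> timestamps = new HashMap<File, Long>(files.length * 2);
		for(File file : files)
			timestamps.put(file, file.lastModified());

		Arrays.sort(files, new Comparator<File>() {
			public int compare(File file1, File file2)
			{
				// Only in-memory lookups happen during the sort.
				return Long.compare(timestamps.get(file1), timestamps.get(file2));
			}
		});
	}
}
```

The trade-off is a hash lookup per comparison instead of a field read, so the wrapper version is likely the leaner of the two; the map version simply keeps the File[] array as the only data structure the caller sees.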