Browse Source

Upgrade Azure Storage client to 4.0.0

We are using `2.0.0` today but Azure team now recommends:

```xml
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-storage</artifactId>
    <version>4.0.0</version>
</dependency>
```

This new version fix the timeout issues we have seen with azure storage although #15080 adds a timeout support.
Azure storage client 2.0.0 was not passing correctly this value when it was calling Azure services.

Note that the timeout is a server side timeout and not client side timeout.
It means that it will raise only a timeout when:

* upload of blob is complete
* if azure service is not able to process the blob (and store it) within a given time range.

In which case it will raise an exception which elasticsearch can deal with:

```
java.io.IOException
    at __randomizedtesting.SeedInfo.seed([91BC11AEF16E073F:6886FA5308FCE4D8]:0)
    at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
    at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
    at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.azure.storage.StorageException: Operation could not be completed within the specified time.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:89)
    at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:175)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
    at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
    at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
    ... 9 more
```

The following code was used to test this against Azure platform:

```java
public void testDumb() throws URISyntaxException, StorageException, IOException, InvalidKeyException {
    String connectionString = "MY-AZURE-STRING";

    CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
    CloudBlobClient client = storageAccount.createCloudBlobClient();
    client.getDefaultRequestOptions().setTimeoutIntervalInMs(1000);
    CloudBlobContainer container = client.getContainerReference("dumb");
    container.createIfNotExists();
    CloudBlockBlob blob = container.getBlockBlobReference("blob");

    File sourceFile = File.createTempFile("sourceFile", ".tmp");

    try {
        int fileSize = 10000000;

        byte[] buffer = new byte[fileSize];
        Random random = new Random();
        random.nextBytes(buffer);

        logger.info("Generate local file");
        FileOutputStream fos = new FileOutputStream(sourceFile);
        fos.write(buffer);
        fos.close();
        logger.info("End generate local file");

        FileInputStream fis = new FileInputStream(sourceFile);

        logger.info("Start uploading");
        blob.upload(fis, fileSize);
        logger.info("End uploading");

    }
    finally {
        if (sourceFile.exists()) {
            sourceFile.delete();
        }
    }
}
```

With 2.0.0, the above code was not raising any exception. With 4.0.0, the exception is now thrown correctly.

The default timeout is 5 minutes. See https://github.com/Azure/azure-storage-java/blob/master/microsoft-azure-storage/src/com/microsoft/azure/storage/core/Utility.java#L352-L375

Closes #12567.

Release notes from 2.0.0:

 * Removed deprecated table AtomPub support.
 * Removed deprecated constructors which take service clients in favor of constructors which take credentials.
 * Added support for "Add" permissions on Blob SAS.
 * Added support for "Create" permissions on Blob and File SAS.
 * Added support for IP Restricted SAS and Protocol SAS.
 * Added support for Account SAS to all services.
 * Added support for Minute and Hour Metrics to FileServiceProperties and added support for File Metrics to CloudAnalyticsClient.
 * Removed deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Removed deprecated Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.

 * Fixed a bug in table where a select on a non-existent field resulted in a null reference exception if the corresponding field in the TableEntity was not nullable.
 * Fixed a bug in table where JsonParser was automatically closing the response stream before it was completely drained causing socket exhaustion.
 * Fixed a bug in StorageCredentialsAccountAndKey.updateKey(String) which prevented valid keys from being set.
 * Added CloudBlobContainer.listBlobs(final String, final boolean) method.
 * Fixed a bug in blob where using AccessConditions on block blob uploads larger than 64MB done with the upload* methods or block blob uploads done openOutputStream with would fail if the blob did not already exist.
 * Added support for setting a proxy per request. Proxy can be set on an OperationContext instance and will be used when that instance is passed to the request method.

 * Added support for SAS to the Azure File service.
 * Added support for Append Blob.
 * Added support for Access Control Lists (ACL) to File Shares.
 * Added support for getting and setting of CORS rules to File service.
 * Added support for ShareStats to File Shares.
 * Added support for copying an Azure File to another Azure File or a Block Blob asynchronously, and aborting Azure File copy operations asynchronously.
 * Added support for copying a Blob to an Azure File asynchronously.
 * Added support for setting a maximum quota property on a File Share.
 * Removed deprecated AuthenticationScheme and its getter and setter. In the future only SharedKey will be used.
 * Removed deprecated getter/setters for all request option properties on the service clients. Please use the default request options getter/setters instead.
 * Removed getSubDirectoryReference() for blob directories and file directories. Use getDirectoryReference() instead.
 * Removed getEntityClass() in TableQuery. Please use getClazzType() instead.
 * Added client-side verification for lease duration and break periods.
 * Deprecated the setters in table for timestamp as this property is only modifiable by the service.
 * Deprecated startCopyFromBlob() on CloudBlob. Use startCopy() instead.
 * Deprecated the Credentials and StorageKey classes. Please use the appropriate methods on StorageCredentialsAccountAndKey instead.
 * Deprecated constructors which take service clients in favor of constructors which take credentials.
 * Fixed a bug where the DateBackwardCompatibility flag was not applied if set on the CloudTableClient default request options.
 * Changed library behavior to retry all exceptions thrown when parsing a response object.
 * Changed behavior to stop removing query parameters passed in with the resource URI if that URI contains a SAS token. Some query parameters such as comp, restype, snapshot and api-version will still be removed.
 * Added support for logging StringToSign to SharedKey and SAS.
 * **Added a connect timeout to prevent hangs when establishing the network connection.**
 * **Made performance enhancements to the BlobOutputStream class.**

 * Fixed a bug where maximum execution time was ignored for file, queue, and table services.
 * **Changed the socket timeout to be set to the service side timeout plus 5 minutes when maximum execution time is not set.**
 * **Changed the socket timeout to default to 5 minutes rather than infinite when neither service side timeout or maximum execution time are set.**
 * Fixed a bug where MD5 was calculated for commitBlockList even though UseTransactionalMD5 was set to false.
 * Fixed a bug where selecting fields that did not exist returned an error rather than an EntityProperty with a null value.
 * Fixed a bug where table entities with a single quote in their partition or row key could be inserted but not operated on in any other way.

 * Fixed a bug for all listing API's where next() would sometimes throw an exception if hasNext() had not been called even if there were more elements to iterate on.
 * Added sequence number to the blob properties. This is populated for page blobs.
 * Creating a page blob sets its length property.
 * Added support for page blob sequence numbers and sequence number access conditions.
 * Fixed a bug in abort copy where the lease access condition was not sent to the service.
 * Fixed an issue in startCopyFromBlob where if the URI of the source blob contained certain non-ASCII characters they would not be encoded appropriately. This would result in Authorization failures.
 * Fixed a small performance issue in XML serialization.
 * Fixed a bug in BlobOutputStream and FileOutputStream where flush added data to a request pool rather than immediately committing it to the Azure service.
 * Refactored to remove the blob, queue, and file package dependency on table in the error handling code.
 * Added additional client-side logging for REST requests, responses, and errors.

Closes #15976.
David Pilato 9 years ago
parent
commit
7a42014909

+ 4 - 2
docs/plugins/repository-azure.asciidoc

@@ -64,8 +64,10 @@ cloud:
 
 `my_account1` is the default account which will be used by a repository unless you set an explicit one.
 
-You can set the timeout to use when making any single request. It can be defined globally, per account or both.
-Defaults to `5m`.
+You can set the client side timeout to use when making any single request. It can be defined globally, per account or both.
+It's not set by default which means that elasticsearch is using the
+http://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.html#setTimeoutIntervalInMs(java.lang.Integer)[default value]
+set by the azure client (known as 5 minutes).
 
 [source,yaml]
 ----

+ 1 - 1
plugins/repository-azure/build.gradle

@@ -23,7 +23,7 @@ esplugin {
 }
 
 dependencies {
-  compile 'com.microsoft.azure:azure-storage:2.0.0'
+  compile 'com.microsoft.azure:azure-storage:4.0.0'
   compile 'org.apache.commons:commons-lang3:3.3.2'
 }
 

+ 0 - 1
plugins/repository-azure/licenses/azure-storage-2.0.0.jar.sha1

@@ -1 +0,0 @@
-b970c65a38da0569013e0c76de7c404f842496c2

+ 1 - 0
plugins/repository-azure/licenses/azure-storage-4.0.0.jar.sha1

@@ -0,0 +1 @@
+b31504f0fb3f9c4458ad053b426357a9b0df6e08

+ 1 - 1
plugins/repository-azure/src/main/java/org/elasticsearch/cloud/azure/storage/AzureStorageService.java

@@ -41,7 +41,7 @@ public interface AzureStorageService {
 
     final class Storage {
         public static final String PREFIX = "cloud.azure.storage.";
-        public static final Setting<TimeValue> TIMEOUT_SETTING = Setting.timeSetting("cloud.azure.storage.timeout", TimeValue.timeValueMinutes(5), false, Setting.Scope.CLUSTER);
+        public static final Setting<TimeValue> TIMEOUT_SETTING = Setting.timeSetting("cloud.azure.storage.timeout", TimeValue.timeValueSeconds(-1), false, Setting.Scope.CLUSTER);
         public static final Setting<String> ACCOUNT_SETTING = Setting.simpleString("repositories.azure.account", false, Setting.Scope.CLUSTER);
         public static final Setting<String> CONTAINER_SETTING = Setting.simpleString("repositories.azure.container", false, Setting.Scope.CLUSTER);
         public static final Setting<String> BASE_PATH_SETTING = Setting.simpleString("repositories.azure.base_path", false, Setting.Scope.CLUSTER);

+ 10 - 8
plugins/repository-azure/src/main/java/org/elasticsearch/cloud/azure/storage/AzureStorageServiceImpl.java

@@ -121,13 +121,15 @@ public class AzureStorageServiceImpl extends AbstractLifecycleComponent<AzureSto
         // only one mode per storage account can be active at a time
         client.getDefaultRequestOptions().setLocationMode(mode);
 
-        // Set timeout option. Defaults to 5mn. See cloud.azure.storage.timeout or cloud.azure.storage.xxx.timeout
-        try {
-            int timeout = (int) azureStorageSettings.getTimeout().getMillis();
-            client.getDefaultRequestOptions().setMaximumExecutionTimeInMs(timeout);
-        } catch (ClassCastException e) {
-            throw new IllegalArgumentException("Can not convert [" + azureStorageSettings.getTimeout() +
-                "]. It can not be longer than 2,147,483,647ms.");
+        // Set timeout option if the user sets cloud.azure.storage.timeout or cloud.azure.storage.xxx.timeout (it's negative by default)
+        if (azureStorageSettings.getTimeout().getSeconds() > 0) {
+            try {
+                int timeout = (int) azureStorageSettings.getTimeout().getMillis();
+                client.getDefaultRequestOptions().setTimeoutIntervalInMs(timeout);
+            } catch (ClassCastException e) {
+                throw new IllegalArgumentException("Can not convert [" + azureStorageSettings.getTimeout() +
+                    "]. It can not be longer than 2,147,483,647ms.");
+            }
         }
         return client;
     }
@@ -273,7 +275,7 @@ public class AzureStorageServiceImpl extends AbstractLifecycleComponent<AzureSto
         CloudBlockBlob blobSource = blob_container.getBlockBlobReference(sourceBlob);
         if (blobSource.exists()) {
             CloudBlockBlob blobTarget = blob_container.getBlockBlobReference(targetBlob);
-            blobTarget.startCopyFromBlob(blobSource);
+            blobTarget.startCopy(blobSource);
             blobSource.delete();
             logger.debug("moveBlob container [{}], sourceBlob [{}], targetBlob [{}] -> done", container, sourceBlob, targetBlob);
         }

+ 18 - 4
plugins/repository-azure/src/test/java/org/elasticsearch/cloud/azure/storage/AzureStorageServiceTests.java

@@ -27,6 +27,8 @@ import org.elasticsearch.test.ESTestCase;
 import java.net.URI;
 
 import static org.hamcrest.Matchers.is;
+import static org.hamcrest.Matchers.lessThan;
+import static org.hamcrest.Matchers.nullValue;
 
 public class AzureStorageServiceTests extends ESTestCase {
     final static Settings settings = Settings.builder()
@@ -126,18 +128,30 @@ public class AzureStorageServiceTests extends ESTestCase {
         AzureStorageServiceImpl azureStorageService = new AzureStorageServiceMock(timeoutSettings);
         azureStorageService.doStart();
         CloudBlobClient client1 = azureStorageService.getSelectedClient("azure1", LocationMode.PRIMARY_ONLY);
-        assertThat(client1.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(10 * 1000));
+        assertThat(client1.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(10 * 1000));
         CloudBlobClient client3 = azureStorageService.getSelectedClient("azure3", LocationMode.PRIMARY_ONLY);
-        assertThat(client3.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(30 * 1000));
+        assertThat(client3.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(30 * 1000));
     }
 
     public void testGetSelectedClientDefaultTimeout() {
         AzureStorageServiceImpl azureStorageService = new AzureStorageServiceMock(settings);
         azureStorageService.doStart();
         CloudBlobClient client1 = azureStorageService.getSelectedClient("azure1", LocationMode.PRIMARY_ONLY);
-        assertThat(client1.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(5 * 60 * 1000));
+        assertThat(client1.getDefaultRequestOptions().getTimeoutIntervalInMs(), nullValue());
         CloudBlobClient client3 = azureStorageService.getSelectedClient("azure3", LocationMode.PRIMARY_ONLY);
-        assertThat(client3.getDefaultRequestOptions().getMaximumExecutionTimeInMs(), is(30 * 1000));
+        assertThat(client3.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(30 * 1000));
+    }
+
+    public void testGetSelectedClientNoTimeout() {
+        Settings timeoutSettings = Settings.builder()
+            .put("cloud.azure.storage.azure.account", "myaccount")
+            .put("cloud.azure.storage.azure.key", "mykey")
+            .build();
+
+        AzureStorageServiceImpl azureStorageService = new AzureStorageServiceMock(timeoutSettings);
+        azureStorageService.doStart();
+        CloudBlobClient client1 = azureStorageService.getSelectedClient("azure", LocationMode.PRIMARY_ONLY);
+        assertThat(client1.getDefaultRequestOptions().getTimeoutIntervalInMs(), is(nullValue()));
     }
 
     /**