[[testing-framework]]
== Java Testing Framework

[[testing-intro]]

Testing is a crucial part of your application, and as information retrieval itself is already a complex topic, there should not be any additional complexity in setting up a testing infrastructure which uses Elasticsearch. This is the main reason why we decided to ship an additional file with the release, which allows you to use the same testing infrastructure we use in the Elasticsearch core. The testing framework allows you to set up clusters with multiple nodes in order to check whether your code covers everything needed to run in a cluster. The framework saves you from writing complex code yourself to start, stop or manage several test nodes in a cluster. In addition, there is another very important feature called randomized testing, which you get for free as it is part of the Elasticsearch infrastructure.

[[why-randomized-testing]]
=== Why randomized testing?

The key concept of randomized testing is not to use the same input values for every test case, but still be able to reproduce a run in case of a failure. This allows you to test with vastly different input variables in order to make sure that your implementation is actually independent of the test data you provide.

All of the tests are run using a custom JUnit runner, the `RandomizedRunner` provided by the randomized-testing project. If you are interested in the implementation being used, check out the http://labs.carrotsearch.com/randomizedtesting.html[RandomizedTesting webpage].

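The idea of "different input on every run, but reproducible on failure" can be pictured without any test framework. The following is a minimal plain-JDK sketch, not the way `RandomizedRunner` is implemented; the class and method names are hypothetical. It derives all test input from a single seed, prints that seed, and accepts it as an argument so that a failing run can be replayed with identical input.

```java
import java.util.Arrays;
import java.util.Random;

/** Framework-free sketch: random test input that can be replayed from its seed. */
class SeedReplaySketch {

    /** Builds the input for one run; the seed fully determines the result. */
    static int[] randomInputs(long seed, int count) {
        Random random = new Random(seed);
        int[] inputs = new int[count];
        for (int i = 0; i < count; i++) {
            inputs[i] = random.nextInt(1000); // values in the range 0..999
        }
        return inputs;
    }

    public static void main(String[] args) {
        // Use the seed given on the command line to replay a run, otherwise pick a fresh one.
        long seed = args.length > 0 ? Long.parseLong(args[0]) : System.nanoTime();
        System.out.println("Running with seed " + seed + " (pass it as an argument to replay)");
        System.out.println(Arrays.toString(randomInputs(seed, 5)));
    }
}
```

Running the sketch twice without arguments will most likely print different input, while passing the printed seed back in reproduces the same array.
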
[[using-elasticsearch-test-classes]]
=== Using the Elasticsearch test classes

First, you need to include the testing dependency in your project, along with the Elasticsearch dependency you have already added. If you use Maven and its `pom.xml` file, it looks like this:

[source,xml]
--------------------------------------------------
<dependencies>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-test-framework</artifactId>
    <version>${lucene.version}</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.elasticsearch.test</groupId>
    <artifactId>framework</artifactId>
    <version>${elasticsearch.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>
--------------------------------------------------

Replace the Elasticsearch version and the Lucene version with the version of Elasticsearch you are using and its accompanying Lucene release.

We provide a few classes that you can inherit from in your own test classes. They provide:

* pre-defined loggers
* randomized testing infrastructure
* a number of helper methods

[[unit-tests]]
=== Unit tests

If your test is a well-isolated unit test which doesn't need a running Elasticsearch cluster, you can use the `ESTestCase`. If you are testing Lucene features, use `ESTestCase`, and if you are testing concrete token streams, use the `ESTokenStreamTestCase` class. Those specific classes execute additional checks after the test has run, which ensure that no resource leaks are happening.

[[integration-tests]]
=== Integration tests

This kind of test requires firing up a whole cluster of nodes before the tests can actually be run. Compared to unit tests they are considerably more time consuming, but the test infrastructure tries to minimize the time cost by only restarting the whole cluster if this is configured explicitly.

The class your tests have to inherit from is `ESIntegTestCase`. By inheriting from this class, you will no longer need to start Elasticsearch nodes manually in your test, although you might need to ensure that at least a certain number of nodes are up. The integration test behaviour can be configured heavily by specifying different system properties on test runs. See the `TESTING.asciidoc` documentation in the https://github.com/elastic/elasticsearch/blob/master/TESTING.asciidoc[source repository] for more information.

[[number-of-shards]]
==== Number of shards

The number of shards used for indices created during integration tests is randomized between `1` and `10` unless overridden upon index creation via index settings.

The rule of thumb is not to specify the number of shards unless needed, so that tests run with a different number of shards from run to run. Alternatively you can override the `numberOfShards()` method. The same applies to the `numberOfReplicas()` method.

[[helper-methods]]
==== Generic helper methods

There are a couple of helper methods in `ESIntegTestCase` which will make your tests shorter and more concise.

[horizontal]
`refresh()`:: Refreshes all indices in a cluster
`ensureGreen()`:: Ensures a green health cluster state, waiting for relocations. Waits the default timeout of 30 seconds before failing.
`ensureYellow()`:: Ensures a yellow health cluster state, also waits for 30 seconds before failing.
`createIndex(name)`:: Creates an index with the specified name
`flush()`:: Flushes all indices in a cluster
`flushAndRefresh()`:: Combines `flush()` and `refresh()` calls
`forceMerge()`:: Waits for all relocations and force merges all indices in the cluster to one segment.
`indexExists(name)`:: Checks if given index exists
`admin()`:: Returns an `AdminClient` for administrative tasks
`clusterService()`:: Returns the cluster service java class
`cluster()`:: Returns the test cluster class, which is explained in the next paragraphs

[[test-cluster-methods]]
==== Test cluster methods

The `InternalTestCluster` class is the heart of the cluster functionality in a randomized test. It allows you to configure a specific setting or replay certain types of outages to check how your custom code reacts.

[horizontal]
`ensureAtLeastNumNodes(n)`:: Ensure at least the specified number of nodes is running in the cluster
`ensureAtMostNumNodes(n)`:: Ensure at most the specified number of nodes is running in the cluster
`getInstance()`:: Get a Guice-instantiated instance of a class from a random node
`getInstanceFromNode()`:: Get a Guice-instantiated instance of a class from a specified node
`stopRandomNode()`:: Stop a random node in your cluster to mimic an outage
`stopCurrentMasterNode()`:: Stop the current master node to force a new election
`stopRandomNonMaster()`:: Stop a random non-master node to mimic an outage
`buildNode()`:: Create a new Elasticsearch node
`startNode(settings)`:: Create and start a new Elasticsearch node

[[changing-node-settings]]
==== Changing node settings

If you want to ensure a certain configuration for the nodes which are started as part of the `ESIntegTestCase`, you can override the `nodeSettings()` method:

[source,java]
-----------------------------------------
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.test.ESIntegTestCase;

public class Mytests extends ESIntegTestCase {

    // Called for each node that the test cluster starts
    @Override
    protected Settings nodeSettings(int nodeOrdinal) {
        return Settings.builder()
            .put(super.nodeSettings(nodeOrdinal)) // keep the settings the framework already chose
            .put("node.mode", "network")
            .build();
    }
}
-----------------------------------------

[[accessing-clients]]
==== Accessing clients

In order to execute any actions, you have to use a client. You can use the `ESIntegTestCase.client()` method to get back a random client. This client can be a `TransportClient` or a `NodeClient`, and usually you do not need to care which one it is as long as the action gets executed. There are several more methods for client selection inside the `InternalTestCluster` class, which can be accessed using the `ESIntegTestCase.internalCluster()` method.

[horizontal]
`iterator()`:: An iterator over all available clients
`masterClient()`:: Returns a client which is connected to the master node
`nonMasterClient()`:: Returns a client which is not connected to the master node
`clientNodeClient()`:: Returns a client which is running on a client node
`client(String nodeName)`:: Returns a client to a given node
`smartClient()`:: Returns a smart client

[[scoping]]
==== Scoping

By default the tests are run with a unique cluster per test suite. Of course all indices and templates are deleted between each test. However, sometimes you need to start a new cluster for each test, for example if you load a certain plugin but do not want to load it for every test.

You can use the `@ClusterScope` annotation at class level to configure this behaviour:

[source,java]
-----------------------------------------
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.test.ESIntegTestCase.ClusterScope;
import org.elasticsearch.test.ESIntegTestCase.Scope;

@ClusterScope(scope = Scope.TEST, numDataNodes = 1)
public class CustomSuggesterSearchTests extends ESIntegTestCase {
    // ... tests go here
}
-----------------------------------------

The above sample configures the test to use a new cluster for each test method. The default scope is `SUITE` (one cluster for all test methods in the test). The `numDataNodes` setting allows you to start only a certain number of data nodes, which can speed up test execution, as starting a new node is a costly and time-consuming operation and might not be needed for this test.

By default, the testing infrastructure will randomly start dedicated master nodes. If you want to disable dedicated masters, you can set `supportsDedicatedMasters=false` in a similar fashion to the `numDataNodes` setting. If dedicated master nodes are not used, data nodes will be allowed to become masters as well.

[[changing-node-configuration]]
==== Changing plugins via configuration

As Elasticsearch is using JUnit 4, using the `@Before` and `@After` annotations is not a problem. However, keep in mind that this does not have any effect on your cluster setup, as the cluster is already up and running when those methods are run. So if you want to configure settings, such as loading a plugin on node startup, before the node is actually running, you should override the `nodePlugins()` method from the `ESIntegTestCase` class and return the plugin classes each node should load.

[source,java]
-----------------------------------------
// Every node started for this test loads the returned plugin classes
@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
    return Arrays.asList(CustomSuggesterPlugin.class);
}
-----------------------------------------

[[randomized-testing]]
=== Randomized testing

The code snippets you saw so far did not show any trace of randomized testing features, as they are carefully hidden under the hood. However, when you are writing your own tests, you should make use of these features as well. Before starting with that, you should know how to repeat a failed test with the same setup with which it failed. Luckily this is quite easy, as the whole mvn call is logged together with failed tests, which means you can simply copy and paste that line and run the test.

[[generating-random-data]]
==== Generating random data

The next step is to convert your test using static test data into a test using randomized test data. The kind of data you could randomize varies a lot with the functionality you are testing against. Take a look at the following examples (note that this list could go on for pages, as a distributed system has many, many moving parts):

* Searching for data using arbitrary UTF-8 characters
* Changing your mapping configuration, index and field names with each run
* Changing your response sizes/configurable limits with each run
* Changing the number of shards/replicas when creating an index

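Converting a static test into a randomized one can be pictured with a small plain-JDK sketch. The class, the `max` method under test and the check methods are hypothetical and exist only for illustration. The sketch seeds a `java.util.Random` by hand purely to stay self-contained; inside an Elasticsearch test the random source comes from the test base class instead.

```java
import java.util.Arrays;
import java.util.Random;

/** Sketch: the same check written once with static data and once with randomized data. */
class StaticVersusRandomizedSketch {

    /** Hypothetical code under test: returns the largest value of a non-empty array. */
    static int max(int[] values) {
        int best = values[0];
        for (int value : values) {
            if (value > best) {
                best = value;
            }
        }
        return best;
    }

    /** Static variant: always the same input, so the same path is exercised on every run. */
    static boolean staticCheck() {
        return max(new int[] {3, 1, 2}) == 3;
    }

    /** Randomized variant: array length and contents change with the seed. */
    static boolean randomizedCheck(long seed) {
        Random random = new Random(seed);
        int[] values = new int[1 + random.nextInt(50)]; // 1..50 elements
        for (int i = 0; i < values.length; i++) {
            values[i] = random.nextInt(); // full int range, including negatives
        }
        // Compare against an independent way of finding the maximum.
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        return max(values) == sorted[sorted.length - 1];
    }
}
```

The randomized variant covers single-element arrays, negative values and duplicates over time, none of which the static input `{3, 1, 2}` ever touches.
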
So, how can you create random data? The most important thing to know is that you should never instantiate your own `Random` instance, but use the one provided in the `RandomizedTest`, from which all Elasticsearch dependent test classes inherit.

[horizontal]
`getRandom()`:: Returns the random instance, which can be recreated when calling the test with specific parameters
`randomBoolean()`:: Returns a random boolean
`randomByte()`:: Returns a random byte
`randomShort()`:: Returns a random short
`randomInt()`:: Returns a random integer
`randomLong()`:: Returns a random long
`randomFloat()`:: Returns a random float
`randomDouble()`:: Returns a random double
`randomInt(max)`:: Returns a random integer between 0 and max
`between()`:: Returns a random number within the supplied range
`atLeast()`:: Returns a random integer of at least the specified integer
`atMost()`:: Returns a random integer of at most the specified integer
`randomLocale()`:: Returns a random locale
`randomTimeZone()`:: Returns a random timezone
`randomFrom()`:: Returns a random element from a list/array

In addition, there are a couple of helper methods allowing you to create random ASCII and Unicode strings; see the methods beginning with `randomAscii`, `randomUnicode`, and `randomRealisticUnicode` in the random test class. The latter tries to create more realistic Unicode strings by not being arbitrarily random.

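To give a feel for the difference between ASCII and arbitrary Unicode test strings, here is a plain-JDK sketch. It is not the implementation of the helpers named above; the class and both methods are hypothetical stand-ins, and the `Random` is passed in by the caller so that the sketch does not create its own random source.

```java
import java.util.Random;

/** Sketch of random string generation; hypothetical stand-ins for the framework helpers. */
class RandomStringSketch {

    /** Random string made of printable ASCII characters (0x20 to 0x7E). */
    static String randomAscii(Random random, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (0x20 + random.nextInt(0x7F - 0x20)));
        }
        return sb.toString();
    }

    /** Random string of the given number of Unicode code points, never a lone surrogate. */
    static String randomUnicode(Random random, int codePoints) {
        StringBuilder sb = new StringBuilder();
        int appended = 0;
        while (appended < codePoints) {
            int codePoint = random.nextInt(Character.MAX_CODE_POINT + 1);
            if (codePoint >= Character.MIN_SURROGATE && codePoint <= Character.MAX_SURROGATE) {
                continue; // surrogates are only valid as pairs, so skip and draw again
            }
            sb.appendCodePoint(codePoint); // supplementary code points become a valid pair
            appended++;
        }
        return sb.toString();
    }
}
```

Note that the Unicode variant may contain unassigned or rarely used code points, which is exactly the kind of input that tends to expose encoding assumptions in code under test.
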
If you want to debug a specific problem with a specific random seed, you can use the `@Seed` annotation to configure a specific seed for a test. If you want to run a test more than once, instead of starting the whole test suite over and over again, you can use the `@Repeat` annotation with an arbitrary value. Each iteration then gets run with a different seed.

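The combination of a fixed seed and repeated iterations can be pictured as follows. This is a plain-JDK sketch under the assumption that every iteration receives its own seed derived from one master seed; it does not claim to mirror how the annotations are implemented, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch: repeat a check several times, each time with a different but replayable seed. */
class RepeatWithSeedsSketch {

    /** Derives one seed per iteration from a master seed, so the whole series can be replayed. */
    static List<Long> iterationSeeds(long masterSeed, int iterations) {
        Random master = new Random(masterSeed);
        List<Long> seeds = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            seeds.add(master.nextLong());
        }
        return seeds;
    }

    /** Hypothetical property under test: reversing a string twice yields the original. */
    static boolean reverseTwiceIsIdentity(long seed) {
        Random random = new Random(seed);
        StringBuilder input = new StringBuilder();
        int length = random.nextInt(20);
        for (int i = 0; i < length; i++) {
            input.append((char) ('a' + random.nextInt(26)));
        }
        String original = input.toString();
        String roundTripped = new StringBuilder(original).reverse().reverse().toString();
        return original.equals(roundTripped);
    }

    /** Runs the check once per derived seed and returns the seeds that failed. */
    static List<Long> failingSeeds(long masterSeed, int iterations) {
        List<Long> failures = new ArrayList<>();
        for (long seed : iterationSeeds(masterSeed, iterations)) {
            if (!reverseTwiceIsIdentity(seed)) {
                failures.add(seed);
            }
        }
        return failures;
    }
}
```

Any seed returned by `failingSeeds` can be fed straight back into `reverseTwiceIsIdentity` to debug that single iteration in isolation.
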
[[assertions]]
=== Assertions

As many Elasticsearch tests are checking for a similar output, like the number of hits, the first hit or special highlighting, a couple of predefined assertions have been created. Those have been put into the `ElasticsearchAssertions` class. There are also specific geo assertions in `ElasticsearchGeoAssertions`.

[horizontal]
`assertHitCount()`:: Checks hit count of a search or count request
`assertAcked()`:: Ensures that a request has been acknowledged by the master
`assertSearchHits()`:: Asserts a search response contains specific ids
`assertMatchCount()`:: Asserts a matching count from a percolation response
`assertFirstHit()`:: Asserts the first hit matches the specified matcher
`assertSecondHit()`:: Asserts the second hit matches the specified matcher
`assertThirdHit()`:: Asserts the third hit matches the specified matcher
`assertSearchHit()`:: Asserts a certain element in a search response matches the specified matcher
`assertNoFailures()`:: Asserts that no shard failures have occurred in the response
`assertFailures()`:: Asserts that shard failures have happened during a search request
`assertHighlight()`:: Asserts specific highlights matched
`assertSuggestion()`:: Asserts specific suggestions
`assertSuggestionSize()`:: Asserts a specific suggestion count
`assertThrows()`:: Asserts a specific exception has been thrown

Common matchers

[horizontal]
`hasId()`:: Matcher to check for a search hit id
`hasType()`:: Matcher to check for a search hit type
`hasIndex()`:: Matcher to check for a search hit index
`hasScore()`:: Matcher to check for a certain score of a hit
`hasStatus()`:: Matcher to check for a certain `RestStatus` of a response

Usually, you would combine assertions and matchers in your test like this:

[source,java]
----------------------------
SearchResponse searchResponse = client().prepareSearch() ...;
assertHitCount(searchResponse, 4);
assertFirstHit(searchResponse, hasId("4"));
assertSearchHits(searchResponse, "1", "2", "3", "4");
----------------------------
  185. ----------------------------