es-connectors-github.md 16 KB


navigation_title: "GitHub" mapped_pages:

Elastic GitHub connector reference [es-connectors-github]

The Elastic GitHub connector is a connector for GitHub. This connector is written in Python using the Elastic connector framework.

View the source code for this connector (branch main, compatible with Elastic 9.0).

::::{important} As of Elastic 9.0, managed connectors on Elastic Cloud Hosted are no longer available. All connectors must be self-managed. ::::

Self-managed connector [es-connectors-github-connector-client-reference]

Availability and prerequisites [es-connectors-github-client-availability-prerequisites]

This connector is available as a self-managed connector.

This self-managed connector is compatible with Elastic versions 8.10.0+.

To use this connector, satisfy all self-managed connector requirements.

Create a GitHub connector [es-connectors-github-create-connector-client]

Use the UI [es-connectors-github-client-create-use-the-ui]

To create a new GitHub connector:

  1. In the Kibana UI, navigate to the Search → Content → Connectors page from the main menu, or use the global search field.
  2. Follow the instructions to create a new GitHub self-managed connector.

Use the API [es-connectors-github-client-create-use-the-api]

You can use the {{es}} Create connector API to create a new self-managed GitHub self-managed connector.

For example:

PUT _connector/my-github-connector
{
  "index_name": "my-elasticsearch-index",
  "name": "Content synced from GitHub",
  "service_type": "github"
}

:::::{dropdown} You’ll also need to create an API key for the connector to use. ::::{note} The user needs the cluster privileges manage_api_key, manage_connector and write_connector_secrets to generate API keys programmatically.

::::

To create an API key for the connector:

  1. Run the following command, replacing values where indicated. Note the encoded return values from the response:

    POST /_security/api_key
    {
      "name": "connector_name-connector-api-key",
      "role_descriptors": {
        "connector_name-connector-role": {
          "cluster": [
            "monitor",
            "manage_connector"
          ],
          "indices": [
            {
              "names": [
                "index_name",
                ".search-acl-filter-index_name",
                ".elastic-connectors*"
              ],
              "privileges": [
                "all"
              ],
              "allow_restricted_indices": false
            }
          ]
        }
      }
    }
    
  2. Update your config.yml file with the API key encoded value.

:::::

Refer to the {{es}} API documentation for details of all available Connector APIs.

Usage [es-connectors-github-client-usage]

To use this connector as a self-managed connector, see Self-managed connectors For additional usage operations, see Connectors UI in {{kib}}.

GitHub personal access token [es-connectors-github-client-personal-access-token]

Configure a GitHub personal access token to fetch data from GitHub.

Follow these steps to generate a GitHub access token:

  • Go to GitHub Settings → Developer settings → Personal access tokens → Tokens(classic).
  • Select Generate new token.
  • Add a note and select the following scopes:

    • repo
    • user
    • read:org
  • Select Generate token and copy the token.

GitHub App [es-connectors-github-client-github-app]

Configure a GitHub App to fetch data from GitHub.

Follow these steps to create a GitHub App:

  • Go to GitHub Settings → Developer settings → GitHub Apps.
  • Select New GitHub App.
  • Add a name and Homepage URL, deselect Active under Webhook.
  • Under Permissions, select Read-only for Commit statuses, Contents, Issues, Metadata and Pull requests under Repository permissions, select Read-only for Members under Organization permissions.
  • Select Any account for Where can this GitHub App be installed?.
  • Click Create GitHub App.
  • Scroll down to the section Private keys, and click Generate a private key.
  • Click Install App in the upper-left corner, select the organizations/personal accounts you want to install the GitHub App on, click Install.
  • You can choose to install it on all repositories or selected repositories, and click Install.

Compatibility [es-connectors-github-client-compatability]

Both GitHub and GitHub Enterprise are supported.

Configuration [es-connectors-github-client-configuration]

The following configuration fields are required:

data_source : GitHub Cloud or GitHub Server.

host : URL of the GitHub Server instance. (GitHub Server only)

auth_method : The method to authenticate the GitHub instance. Toggle between Personal access token and GitHub App.

token : GitHub personal access token to authenticate the GitHub instance. This field is only available for Personal access token authentication method.

repo_type : Toggle between Organization and Other. Note that document level security (DLS) is only available for Organization repositories.

org_name : Name of the organization to fetch data from. This field is only available when Authentication method is set to Personal access token and Repository Type is set to Organization.

app_id : App ID of the GitHub App. This field is only available when Authentication method is set to GitHub App.

private_key : Private key generated for the GitHub App. This field is only available when Authentication method is set to GitHub App.

repositories : Comma-separated list of repositories to fetch data from GitHub instance. If the value is * the connector will fetch data from all repositories present in the configured user’s account.

Default value is `*`.

Examples:

* `elasticsearch`,`elastic/kibana`
* `*`

::::{tip} Repository ownership

If the "OWNER/" portion of the "OWNER/REPO" repository argument is omitted, it defaults to the name of the authenticating user.

In the examples provided here:

  • the elasticsearch repo synced will be the <OWNER>/elasticsearch
  • the kibana repo synced will be the Elastic owned repo

The "OWNER/" portion of the "OWNER/REPO" repository argument must be provided when GitHub App is selected as the Authentication method.

::::

::::{note} This field can be bypassed by advanced sync rules.

::::

ssl_enabled : Whether SSL verification will be enabled. Default value is False.

ssl_ca : Content of SSL certificate. Note: If ssl_enabled is False, the value in this field is ignored. Example certificate:

```txt
-----BEGIN CERTIFICATE-----
MIID+jCCAuKgAwIBAgIGAJJMzlxLMA0GCSqGSIb3DQEBCwUAMHoxCzAJBgNVBAYT
...
7RhLQyWn2u00L7/9Omw=
-----END CERTIFICATE-----
```

use_document_level_security : Toggle to enable document level security (DLS). When enabled, full syncs will fetch access control lists for each document and store them in the _allow_access_control field. DLS is only available when Repository Type is set to Organization.

retry_count : The number of retry attempts after failed request to GitHub. Default value is 3.

use_text_extraction_service : Requires a separate deployment of the Elastic Text Extraction Service. Requires that pipeline settings disable text extraction. Default value is False.

Deployment using Docker [es-connectors-github-client-docker]

You can deploy the GitHub connector as a self-managed connector using Docker. Follow these instructions.

::::{dropdown} Step 1: Download sample configuration file Download the sample configuration file. You can either download it manually or run the following command:

curl https://raw.githubusercontent.com/elastic/connectors/main/config.yml.example --output ~/connectors-config/config.yml

Remember to update the --output argument value if your directory name is different, or you want to use a different config file name.

::::

::::{dropdown} Step 2: Update the configuration file for your self-managed connector Update the configuration file with the following settings to match your environment:

  • elasticsearch.host
  • elasticsearch.api_key
  • connectors

If you’re running the connector service against a Dockerized version of Elasticsearch and Kibana, your config file will look like this:

# When connecting to your cloud deployment you should edit the host value
elasticsearch.host: http://host.docker.internal:9200
elasticsearch.api_key: <ELASTICSEARCH_API_KEY>

connectors:
  -
    connector_id: <CONNECTOR_ID_FROM_KIBANA>
    service_type: github
    api_key: <CONNECTOR_API_KEY_FROM_KIBANA> # Optional. If not provided, the connector will use the elasticsearch.api_key instead

Using the elasticsearch.api_key is the recommended authentication method. However, you can also use elasticsearch.username and elasticsearch.password to authenticate with your Elasticsearch instance.

Note: You can change other default configurations by simply uncommenting specific settings in the configuration file and modifying their values.

::::

::::{dropdown} Step 3: Run the Docker image Run the Docker image with the Connector Service using the following command:

docker run \
-v ~/connectors-config:/config \
--network "elastic" \
--tty \
--rm \
docker.elastic.co/integrations/elastic-connectors:9.0.0 \
/app/bin/elastic-ingest \
-c /config/config.yml

::::

Refer to DOCKER.md in the elastic/connectors repo for more details.

Find all available Docker images in the official registry.

::::{tip} We also have a quickstart self-managed option using Docker Compose, so you can spin up all required services at once: Elasticsearch, Kibana, and the connectors service. Refer to this README in the elastic/connectors repo for more information.

::::

Documents and syncs [es-connectors-github-client-documents-syncs]

The connector syncs the following objects and entities:

  • Repositories
  • Pull Requests
  • Issues
  • Files & Folder

Only the following file extensions are ingested:

  • .markdown
  • .md
  • .rst

::::{note}

  • Content of files bigger than 10 MB won’t be extracted.
  • Permissions are not synced. All documents indexed to an Elastic deployment will be visible to all users with access to that Elasticsearch Index.

::::

Sync types [es-connectors-github-client-sync-types]

Full syncs are supported by default for all connectors.

This connector also supports incremental syncs.

Sync rules [es-connectors-github-client-sync-rules]

Basic sync rules are identical for all connectors and are available by default. For more information read Types of sync rule.

Advanced sync rules [es-connectors-github-client-sync-rules-advanced]

::::{note} A full sync is required for advanced sync rules to take effect.

::::

The following section describes advanced sync rules for this connector. Advanced sync rules are defined through a source-specific DSL JSON snippet.

The following sections provide examples of advanced sync rules for this connector.

$$$es-connectors-github-client-sync-rules-advanced-branch$$$ Indexing document and files based on branch name configured via branch key

[
  {
    "repository": "repo_name",
    "filter": {
      "branch": "sync-rules-feature"
    }
  }
]

$$$es-connectors-github-client-sync-rules-advanced-issue-key$$$ Indexing document based on issue query related to bugs via issue key

[
  {
    "repository": "repo_name",
    "filter": {
      "issue": "is:bug"
    }
  }
]

$$$es-connectors-github-client-sync-rules-advanced-pr-key$$$ Indexing document based on PR query related to open PR’s via PR key

[
  {
    "repository": "repo_name",
    "filter": {
      "pr": "is:open"
    }
  }
]

$$$es-connectors-github-client-sync-rules-advanced-issue-query-branch-name$$$ Indexing document and files based on queries and branch name

[
  {
    "repository": "repo_name",
    "filter": {
      "issue": "is:bug",
      "pr": "is:open",
      "branch": "sync-rules-feature"
    }
  }
]

::::{note} All documents pulled by a given rule are indexed regardless of whether the document has already been indexed by a previous rule. This can lead to document duplication, but the indexed documents count will differ in the logs. Check the Elasticsearch index for the actual document count.

::::

$$$es-connectors-github-client-sync-rules-advanced-overlapping$$$ Advanced rules for overlapping

[
  {
    "filter": {
      "pr": "is:pr is:merged label:auto-backport merged:>=2023-07-20"
    },
    "repository": "repo_name"
  },
  {
    "filter": {
      "pr": "is:pr is:merged label:auto-backport merged:>=2023-07-15"
    },
    "repository": "repo_name"
  }
]

::::{note} If GitHub App is selected as the authentication method, the "OWNER/" portion of the "OWNER/REPO" repository argument must be provided.

::::

Content Extraction [es-connectors-github-client-content-extraction]

See Content extraction.

Self-managed connector operations [es-connectors-github-client-connector-client-operations]

End-to-end testing [es-connectors-github-client-testing]

The connector framework enables operators to run functional tests against a real data source. Refer to Connector testing for more details.

To perform E2E testing for the GitHub connector, run the following command:

$ make ftest NAME=github

For faster tests, add the DATA_SIZE=small flag:

make ftest NAME=github DATA_SIZE=small

Known issues [es-connectors-github-client-known-issues]

There are currently no known issues for this connector. Refer to Known issues for a list of known issues for all connectors.

Troubleshooting [es-connectors-github-client-troubleshooting]

See Troubleshooting.

Security [es-connectors-github-client-security]

See Security.