Elastic GitHub connector reference

The Elastic GitHub connector is a connector for GitHub. This connector is written in Python using the Elastic connector framework.

View the source code for this connector (branch main, compatible with Elastic 9.0).

Important

As of Elastic 9.0, managed connectors on Elastic Cloud Hosted are no longer available. All connectors must be self-managed.

Toggle to enable document level security (DLS). When enabled, full syncs will fetch access control lists for each document and store them in the _allow_access_control field. DLS is only available when Repository Type is set to Organization.

retry_count

The number of retry attempts after a failed request to GitHub. Default value is 3.

use_text_extraction_service

Requires a separate deployment of the Elastic Text Extraction Service. Requires that pipeline settings disable text extraction. Default value is False.
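
The effect of retry_count can be illustrated with a minimal sketch. This is not the connector's implementation: call_with_retries and its parameters are hypothetical names, and the sketch assumes the first request does not count as a retry, so a retry_count of 3 allows up to four requests in total.

```python
import time


def call_with_retries(request, retry_count=3, delay_seconds=0.0):
    """Call request(); after a failure, retry up to retry_count more times.

    Hypothetical helper for illustration only. Assumes the initial call
    is not counted as a retry.
    """
    retries_used = 0
    while True:
        try:
            return request()
        except Exception:
            # Give up once every allowed retry has been spent.
            if retries_used >= retry_count:
                raise
            retries_used += 1
            time.sleep(delay_seconds)
```

With the default value, a request that fails twice and then succeeds returns normally on the third call, while a request that always fails raises after the fourth call.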

Deployment using Docker

You can deploy the GitHub connector as a self-managed connector using Docker. Follow these instructions.

Download the sample configuration file. You can either download it manually or run the following command:

    curl https://raw.githubusercontent.com/elastic/connectors/main/config.yml.example --output ~/connectors-config/config.yml

Remember to update the --output argument value if your directory name is different, or you want to use a different config file name.

Update the configuration file with the following settings to match your environment:

elasticsearch.host
elasticsearch.api_key
connectors

If you’re running the connector service against a Dockerized version of Elasticsearch and Kibana, your config file will look like this:

    # When connecting to your cloud deployment you should edit the host value
    elasticsearch.host: http://host.docker.internal:9200
    elasticsearch.api_key: <ELASTICSEARCH_API_KEY>

    connectors:
      -
        connector_id: <CONNECTOR_ID_FROM_KIBANA>
        service_type: github
        api_key: <CONNECTOR_API_KEY_FROM_KIBANA> # Optional. If not provided, the connector will use the elasticsearch.api_key instead

Using the elasticsearch.api_key is the recommended authentication method. However, you can also use elasticsearch.username and elasticsearch.password to authenticate with your Elasticsearch instance.
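
The API key fallback described above can be sketched as follows. This is an illustration only, not code from the connector framework: resolve_api_key is a hypothetical helper, and the dictionary mirrors the sample configuration with its placeholder values left in place.

```python
def resolve_api_key(config, connector):
    """Return the API key a connector entry would use, or None.

    Hypothetical helper: a connector-level api_key wins, otherwise the
    top-level elasticsearch.api_key is used. None means neither key is
    set, for example when a username and password are configured instead.
    """
    return connector.get("api_key") or config.get("elasticsearch.api_key")


# Mirrors the sample config file; the values are placeholders.
config = {
    "elasticsearch.host": "http://host.docker.internal:9200",
    "elasticsearch.api_key": "<ELASTICSEARCH_API_KEY>",
    "connectors": [
        {"connector_id": "<CONNECTOR_ID_FROM_KIBANA>", "service_type": "github"},
    ],
}

keys = [resolve_api_key(config, entry) for entry in config["connectors"]]
```

Because the connector entry in this example has no api_key of its own, it resolves to the elasticsearch.api_key value.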

Note: You can change other default configurations by simply uncommenting specific settings in the configuration file and modifying their values.

Run the Docker image with the Connector Service using the following command:

    docker run \
      -v ~/connectors-config:/config \
      --network "elastic" \
      --tty \
      --rm \
      docker.elastic.co/integrations/elastic-connectors:9.0.0 \
      /app/bin/elastic-ingest \
      -c /config/config.yml

Refer to DOCKER.md in the elastic/connectors repo for more details.

Find all available Docker images in the official registry.

Tip

We also have a quickstart self-managed option using Docker Compose, so you can spin up all required services at once: Elasticsearch, Kibana, and the connectors service. Refer to this README in the elastic/connectors repo for more information.

Documents and syncs

The connector syncs the following objects and entities:

Repositories
Pull Requests
Issues
Files & Folders

Only the following file extensions are ingested:

.markdown
.md
.rst

Note

Content of files bigger than 10 MB won’t be extracted.
Permissions are not synced unless document level security (DLS) is enabled. Without DLS, all documents indexed to an Elastic deployment are visible to all users with access to that Elasticsearch index.
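
The file rules above can be expressed as a small sketch. The function names are hypothetical and the connector's own checks may differ. In particular, this sketch assumes that extensions are matched exactly as listed and that 10 MB means 10 * 1024 * 1024 bytes.

```python
from pathlib import PurePosixPath

# Extensions listed in this section.
INGESTED_EXTENSIONS = {".markdown", ".md", ".rst"}
# Assumption: 10 MB is interpreted as 10 * 1024 * 1024 bytes.
MAX_EXTRACT_BYTES = 10 * 1024 * 1024


def is_ingested(path):
    """Return True if the file extension is one of the ingested types."""
    return PurePosixPath(path).suffix in INGESTED_EXTENSIONS


def content_is_extracted(path, size_bytes):
    """Return True if the file is ingested and within the size limit."""
    return is_ingested(path) and size_bytes <= MAX_EXTRACT_BYTES
```

For example, docs/guide.rst qualifies by extension, while src/main.py does not, and a qualifying file larger than the limit would be listed without its content.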

Sync types

Full syncs are supported by default for all connectors.

This connector also supports incremental syncs.

Sync rules

Basic sync rules are identical for all connectors and are available by default. For more information read Types of sync rule.

Advanced sync rules

Note

A full sync is required for advanced sync rules to take effect.

Advanced sync rules are defined through a source-specific DSL JSON snippet. The following sections provide examples of advanced sync rules for this connector.

Indexing documents and files based on a branch name, configured via the branch key

    [
      {
        "repository": "repo_name",
        "filter": {
          "branch": "sync-rules-feature"
        }
      }
    ]

Indexing documents based on an issue query for bugs, configured via the issue key

    [
      {
        "repository": "repo_name",
        "filter": {
          "issue": "is:bug"
        }
      }
    ]

Indexing documents based on a PR query for open PRs, configured via the pr key

    [
      {
        "repository": "repo_name",
        "filter": {
          "pr": "is:open"
        }
      }
    ]

Indexing documents and files based on queries and a branch name

    [
      {
        "repository": "repo_name",
        "filter": {
          "issue": "is:bug",
          "pr": "is:open",
          "branch": "sync-rules-feature"
        }
      }
    ]
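
When rules are generated rather than typed by hand, a short script can help keep the JSON well formed. The following is a minimal sketch: make_rule is a hypothetical helper, and the set of accepted filter keys covers only the keys shown in the examples in this section.

```python
import json

# Filter keys that appear in the examples in this section.
ALLOWED_FILTER_KEYS = {"branch", "issue", "pr"}


def make_rule(repository, **filters):
    """Build one advanced sync rule as a plain dictionary.

    Hypothetical helper: rejects filter keys not shown in this section.
    """
    unknown = set(filters) - ALLOWED_FILTER_KEYS
    if unknown:
        raise ValueError(f"unsupported filter keys: {sorted(unknown)}")
    return {"repository": repository, "filter": filters}


rules = [
    make_rule(
        "repo_name",
        issue="is:bug",
        pr="is:open",
        branch="sync-rules-feature",
    )
]

# Print the snippet in the same shape as the examples above.
print(json.dumps(rules, indent=2))
```

The printed JSON has the same structure as the combined example, so it can be reviewed before it is saved as an advanced sync rule.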

	

Note

All documents pulled by a given rule are indexed regardless of whether the document has already been indexed by a previous rule. The same document can therefore be processed more than once, so the indexed document count reported in the logs can differ from the actual count. Check the Elasticsearch index for the actual document count.

Advanced rules with overlapping results

    [
      {
        "filter": {
          "pr": "is:pr is:merged label:auto-backport merged:>=2023-07-20"
        },
        "repository": "repo_name"
      },
      {
        "filter": {
          "pr": "is:pr is:merged label:auto-backport merged:>=2023-07-15"
        },
        "repository": "repo_name"
      }
    ]
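
To see why these two rules overlap, the sketch below models only the merged:>= qualifier of each query and assumes every other qualifier already matches. The merge dates passed in are hypothetical examples, not data from a real repository.

```python
from datetime import date

# Cutoff dates taken from the two rules above, in rule order.
RULE_CUTOFFS = [date(2023, 7, 20), date(2023, 7, 15)]


def matching_rules(merged_on):
    """Return the indexes of the rules whose merged:>= cutoff is met."""
    return [
        index
        for index, cutoff in enumerate(RULE_CUTOFFS)
        if merged_on >= cutoff
    ]
```

A PR merged on 2023-07-22 matches both rules, while one merged on 2023-07-17 matches only the second, so every PR pulled by the first rule is also pulled by the second.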

	

Note

If GitHub App is selected as the authentication method, the "OWNER/" portion of the "OWNER/REPO" repository argument must be provided.
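
A quick check for the OWNER/REPO form might look like the sketch below. has_owner is a hypothetical helper for validating rule input before saving it, not part of the connector.

```python
def has_owner(repository):
    """Return True if the value has non-empty OWNER and REPO parts."""
    owner, separator, repo = repository.partition("/")
    return bool(owner and separator and repo)
```

Here elastic/connectors passes the check, while a bare connectors value does not.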

Run the functional tests (ftest) for the GitHub connector with the following command:

    make ftest NAME=github

For faster tests, add the DATA_SIZE=small flag:

    make ftest NAME=github DATA_SIZE=small
