fearful-symmetry
Contributor

Change Summary

This is part of https://github.com/elastic/endpoint-dev/pull/15318, as we need to update the fields here in order for the EAF tests to pass.

The documentation here is a tad difficult to follow, so this is probably incomplete. I'm not sure what the difference is between custom_schemas and custom_documentation, or whether I should just copy and paste fields between them.

Sample values

Sample document:

    {
        "agent": {
            "id": "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa",
            "type": "endpoint",
            "version": "9.0.0-SNAPSHOT"
        },
        "process": {
            "Ext": {
                "ancestry": [
                    "2FglUZ0Lr/dQXMFoO1Bvqg",
                    "ujM4C6D44yhXfnENvuGZnw",
                    "ReLpB+OskmxuUDOXsozgyA",
                    "TryIwW2T5O7gn1VvKhZgCQ",
                    "lGOe4YnopeNXWmCEnKJp3Q"
                ],
                "memfd": {
                    "flag_hugetlb": false,
                    "flag_allow_seal": true,
                    "flags": 2,
                    "name": "test_memfd",
                    "flag_exec": false,
                    "flag_cloexec": false,
                    "flag_noexec_seal": false
                }
            },
            "parent": {
                "real_user": {
                    "name": "alexk",
                    "id": 1000
                },
                "interactive": true,
                "start": "2024-11-21T16:53:03.18Z",
                "pid": 29994,
                "working_directory": "/home/alexk/endpoint-dev/Python",
                "entity_id": "2FglUZ0Lr/dQXMFoO1Bvqg",
                "executable": "/usr/bin/python3.12",
                "args": [
                    "/home/alexk/endpoint-dev/.venv/EAF/bin/python3",
                    "/home/alexk/endpoint-dev/.venv/EAF/bin/pytest",
                    "-v",
                    "endpoint/test/test_events.py::ProcessEventTests_0_capture_mode__ebpf::test_events_memfd_create"
                ],
                "name": "python3.12",
                "tty": {
                    "char_device": {
                        "major": 136,
                        "minor": 6
                    }
                },
                "real_group": {
                    "name": "alexk",
                    "id": 1000
                },
                "args_count": 4,
                "user": {
                    "name": "alexk",
                    "id": 1000
                },
                "command_line": "/home/alexk/endpoint-dev/.venv/EAF/bin/python3 /home/alexk/endpoint-dev/.venv/EAF/bin/pytest -v endpoint/test/test_events.py::ProcessEventTests_0_capture_mode__ebpf::test_events_memfd_create",
                "group": {
                    "name": "alexk",
                    "id": 1000
                },
                "supplemental_groups": [
                    {
                        "name": "wheel",
                        "id": 10
                    },
                    {
                        "name": "libvirt",
                        "id": 985
                    },
                    {
                        "name": "docker",
                        "id": 988
                    }
                ]
            },
            "group_leader": {
                "real_user": {
                    "name": "alexk",
                    "id": 1000
                },
                "interactive": true,
                "start": "2024-11-21T16:53:03.18Z",
                "pid": 29994,
                "working_directory": "/home/alexk/endpoint-dev/Python",
                "entity_id": "2FglUZ0Lr/dQXMFoO1Bvqg",
                "executable": "/usr/bin/python3.12",
                "args": [
                    "/home/alexk/endpoint-dev/.venv/EAF/bin/python3",
                    "/home/alexk/endpoint-dev/.venv/EAF/bin/pytest",
                    "-v",
                    "endpoint/test/test_events.py::ProcessEventTests_0_capture_mode__ebpf::test_events_memfd_create"
                ],
                "name": "python3.12",
                "tty": {
                    "char_device": {
                        "major": 136,
                        "minor": 6
                    }
                },
                "real_group": {
                    "name": "alexk",
                    "id": 1000
                },
                "args_count": 4,
                "same_as_process": false,
                "user": {
                    "name": "alexk",
                    "id": 1000
                },
                "group": {
                    "name": "alexk",
                    "id": 1000
                },
                "supplemental_groups": [
                    {
                        "name": "wheel",
                        "id": 10
                    },
                    {
                        "name": "libvirt",
                        "id": 985
                    },
                    {
                        "name": "docker",
                        "id": 988
                    }
                ]
            },
            "previous": [
                {
                    "args": [
                        "/home/alexk/endpoint-dev/.venv/EAF/bin/python3",
                        "/home/alexk/endpoint-dev/.venv/EAF/bin/pytest",
                        "-v",
                        "endpoint/test/test_events.py::ProcessEventTests_0_capture_mode__ebpf::test_events_memfd_create"
                    ],
                    "args_count": 4,
                    "executable": "/usr/bin/python3.12"
                }
            ],
            "real_user": {
                "name": "alexk",
                "id": 1000
            },
            "interactive": true,
            "start": "2024-11-21T16:54:21.88Z",
            "pid": 30545,
            "working_directory": "/home/alexk/endpoint-dev/Python",
            "entity_id": "5wL0HgFagbHMKoIHNuBegA",
            "executable": "/home/alexk/endpoint-dev/.venv/EAF/bin/python",
            "args": [
                "python",
                "-c",
                "import os; os.memfd_create('test_memfd', os.MFD_ALLOW_SEALING)"
            ],
            "session_leader": {
                "real_user": {
                    "name": "alexk",
                    "id": 1000
                },
                "interactive": true,
                "start": "2024-11-21T15:26:51.11Z",
                "pid": 25614,
                "working_directory": "/home/alexk/endpoint-dev/Python",
                "entity_id": "ujM4C6D44yhXfnENvuGZnw",
                "executable": "/usr/bin/zsh",
                "args": [
                    "/usr/bin/zsh",
                    "-i"
                ],
                "name": "zsh",
                "tty": {
                    "char_device": {
                        "major": 136,
                        "minor": 6
                    }
                },
                "real_group": {
                    "name": "alexk",
                    "id": 1000
                },
                "args_count": 2,
                "same_as_process": false,
                "user": {
                    "name": "alexk",
                    "id": 1000
                },
                "group": {
                    "name": "alexk",
                    "id": 1000
                },
                "supplemental_groups": [
                    {
                        "name": "wheel",
                        "id": 10
                    },
                    {
                        "name": "libvirt",
                        "id": 985
                    },
                    {
                        "name": "docker",
                        "id": 988
                    }
                ]
            },
            "entry_leader": {
                "parent": {
                    "start": "2024-11-20T15:01:44.02Z",
                    "pid": 1,
                    "entity_id": "AdtuAiv+v3CdZ92vB6q0RQ"
                },
                "real_user": {
                    "name": "alexk",
                    "id": 1000
                },
                "interactive": false,
                "start": "2024-11-21T15:05:51.8Z",
                "entry_meta": {
                    "type": "unknown"
                },
                "pid": 23494,
                "working_directory": "/home/alexk",
                "entity_id": "lGOe4YnopeNXWmCEnKJp3Q",
                "executable": "/usr/bin/bash",
                "args": [
                    "sh",
                    "/home/alexk/.vscode-server/cli/servers/Stable-e8653663e8840adaf45af01eab5c627a5af81807/server/bin/code-server",
                    "--connection-token=remotessh",
                    "--accept-server-license-terms",
                    "--start-server",
                    "--enable-remote-auto-shutdown",
                    "--socket-path=/tmp/code-25b9efc4-a1ec-4e3c-af48-8666aa4b847c"
                ],
                "name": "bash",
                "real_group": {
                    "name": "alexk",
                    "id": 1000
                },
                "args_count": 7,
                "same_as_process": false,
                "user": {
                    "name": "alexk",
                    "id": 1000
                },
                "group": {
                    "name": "alexk",
                    "id": 1000
                },
                "supplemental_groups": [
                    {
                        "name": "wheel",
                        "id": 10
                    },
                    {
                        "name": "libvirt",
                        "id": 985
                    },
                    {
                        "name": "docker",
                        "id": 988
                    }
                ]
            },
            "name": "python",
            "tty": {
                "char_device": {
                    "major": 136,
                    "minor": 6
                }
            },
            "real_group": {
                "name": "alexk",
                "id": 1000
            },
            "args_count": 3,
            "user": {
                "name": "alexk",
                "id": 1000
            },
            "command_line": "python -c import os; os.memfd_create('test_memfd', os.MFD_ALLOW_SEALING)",
            "hash": {
                "sha256": "79d8d6661062c882888095db44299a4f3c58a45819eb71b5823733b41d7c7d01"
            },
            "group": {
                "name": "alexk",
                "id": 1000
            },
            "supplemental_groups": [
                {
                    "name": "wheel",
                    "id": 10
                },
                {
                    "name": "libvirt",
                    "id": 985
                },
                {
                    "name": "docker",
                    "id": 988
                }
            ]
        },
        "@timestamp": "2024-11-21T16:54:21.8897838Z",
        "ecs": {
            "version": "8.10.0"
        },
        "data_stream": {
            "namespace": "default",
            "type": "logs",
            "dataset": "endpoint.events.process"
        },
        "elastic": {
            "agent": {
                "id": "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
            }
        },
        "host": {
            "os": {
                "type": "linux"
            },
            "name": "motmot",
            "id": "dabadaba-0000-0000-0000-000000000000"
        },
        "event": {
            "sequence": 1639,
            "created": "2024-11-21T16:54:21.8897838Z",
            "kind": "event",
            "module": "endpoint",
            "action": [
                "memfd_create"
            ],
            "id": "NnxZeOc0tCK3sTxG++++++TU",
            "category": [
                "process"
            ],
            "type": [
                "start"
            ],
            "dataset": "endpoint.events.process",
            "outcome": "unknown"
        },
        "message": "Endpoint process event",
        "user": {
            "Ext": {
                "real": {
                    "name": "alexk",
                    "id": 1000
                }
            },
            "name": "alexk",
            "id": 1000
        },
        "group": {
            "Ext": {
                "real": {
                    "name": "alexk",
                    "id": 1000
                }
            },
            "name": "alexk",
            "id": 1000
        }
    }
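
For reference, the sample's memfd values are consistent with a plain bitmask decode of flags. The sketch below is only an illustration: the bit values come from the Linux uapi header linux/memfd.h, but the pairing of each bit with a flag_* field name is an assumption inferred from the sample above, and decode_memfd_flags is a hypothetical helper, not the endpoint's actual code.

```python
# Flag bits as defined in the Linux uapi header <linux/memfd.h>.
MFD_CLOEXEC = 0x0001
MFD_ALLOW_SEALING = 0x0002
MFD_HUGETLB = 0x0004
MFD_NOEXEC_SEAL = 0x0008
MFD_EXEC = 0x0010

# Assumed pairing of flag bits with the process.Ext.memfd field names
# seen in the sample document; the endpoint's real logic may differ.
FLAG_FIELDS = {
    "flag_cloexec": MFD_CLOEXEC,
    "flag_allow_seal": MFD_ALLOW_SEALING,
    "flag_hugetlb": MFD_HUGETLB,
    "flag_noexec_seal": MFD_NOEXEC_SEAL,
    "flag_exec": MFD_EXEC,
}


def decode_memfd_flags(flags: int) -> dict:
    """Expand the raw memfd_create flags integer into per-flag booleans."""
    decoded = {field: bool(flags & bit) for field, bit in FLAG_FIELDS.items()}
    decoded["flags"] = flags
    return decoded


if __name__ == "__main__":
    # The sample event was produced with os.MFD_ALLOW_SEALING, i.e. flags == 2.
    print(decode_memfd_flags(2))
```

With flags set to 2, only flag_allow_seal decodes to true, which matches the sample document.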

Release Target

Q/A

For mapping changes:

  • I ran make after making the schema changes, and committed all changes
  • If these field(s) are "exception"-able, I made a companion PR to Kibana adding it (see Readme)
  • If this is a metadata change, I also updated both transform destination schemas to match

@fearful-symmetry fearful-symmetry self-assigned this Nov 21, 2024
@pzl
Member

pzl commented Nov 21, 2024

> I'm not sure what the difference is between custom_schemas and custom_documentation

custom_documentation is effectively wiki-only information. @ferullo is there a public link to where this shows up / is rendered?

Is this PR meant to add or change any data stream mappings? Then you would also make changes in other folders.

  • custom_subsets/elastic_endpoint/<data-stream>: These map to the actual data streams shipped as part of the defend integration. Fields must be added here to be mapped. Fields can be sent by endpoint and retrieved in _source of a document without being mapped. We would only need to map if we intend to filter or search by these fields being added. If we are not searching by them, then perhaps just documenting them in custom_documentation is all we need.

If the field(s) we are looking at do need to be mapped, then they should be added to the data stream directory above. If the field is part of ECS (~8.10), then just adding the field name to that data stream should be enough. If the field is part of a newer ECS release, then we may need to do some bookkeeping to update our ECS reference.

If the field is not part of ECS (a pretty common case, for our workflow here), then the field definition goes in custom_schemas. That is where the field gets defined in the abstract (description, mapping data type, relevant mapping settings, etc). Those definitions go in an appropriate place there, usually chosen based on the field's parent key path.

So then in custom_subsets you just specify the name of the field, and the definition gets imported from the custom_schemas area.
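
As a rough mental model of that split, here is a minimal sketch. Everything in it is hypothetical: the real files are not Python dicts, CUSTOM_SCHEMAS, PROCESS_SUBSET, and resolve_subset are illustrative names, and the keyword type for the name field is an assumption. It only shows the idea that the subset lists names while the definitions come from the schema.

```python
# Hypothetical stand-in for custom_schemas: abstract field definitions.
CUSTOM_SCHEMAS = {
    "process.Ext.memfd.name": {
        "type": "keyword",  # assumed type, for illustration only
        "description": "Name passed to memfd_create.",
    },
    "process.Ext.memfd.flag_exec": {
        "type": "boolean",
        "description": "Whether the exec flag was set.",
    },
}

# Hypothetical stand-in for one custom_subsets data stream: names only.
PROCESS_SUBSET = [
    "process.Ext.memfd.name",
    "process.Ext.memfd.flag_exec",
]


def resolve_subset(subset: list, schemas: dict) -> dict:
    """Look up each field named in the subset and return its definition."""
    missing = [name for name in subset if name not in schemas]
    if missing:
        # A name listed in the subset has no definition to import.
        raise KeyError(f"no definition for: {', '.join(missing)}")
    return {name: schemas[name] for name in subset}


if __name__ == "__main__":
    for name, definition in resolve_subset(PROCESS_SUBSET, CUSTOM_SCHEMAS).items():
        print(name, "->", definition["type"])
```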

But again, if we don't need to search on it, you do not have to map. Depends on the end use case here. If this is just about getting that EAF error message to go away, then you might be done already.

Also please hold off on merging anything in this repo until #563 goes in. Just coincidental timing. Edit: merged, you are unblocked.

@ferullo
Contributor

ferullo commented Nov 22, 2024

> custom_documentation is effectively wiki-only information. @ferullo is there a public link to where this shows up / is rendered?

It is rendered in the doc directory right next to the src directory. It looks like the tool to do that rendering was run already for this PR.

@fearful-symmetry
Contributor Author

@pzl thanks for your help. So, as far as I'm aware these aren't ECS fields, so it sounds like the changes also need to go in custom_schemas and the name goes in custom_subsets.

Is the info in package auto-generated?

@pzl
Member

pzl commented Nov 22, 2024

So if these do need mapping for filtering/search, then yes, put the details about the fields in custom_schemas and then you can place that field in its data stream in custom_subsets.

Yes, package/ is pretty much entirely generated output from this flow above (with make)

@fearful-symmetry
Contributor Author

@pzl Does that look right?

@pzl
Member

pzl commented Nov 22, 2024

@fearful-symmetry stellar. Looks perfect. If you could add your sample values for memfd to the document in package/endpoint/data_stream/process/sample_event.json then I would say the PR is complete. That will check whatever sample values you add against the types in the mapping, during automated testing in this repo.
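
To make the idea of checking sample values against mapping types concrete, here is a minimal sketch. It is not how the repo's automated tests are implemented; find_mismatches is a hypothetical helper, the long and keyword types for flags and name are assumptions, and the checks are stricter than what Elasticsearch itself accepts.

```python
# Simplified, strict checks for three mapping types (illustrative only).
CHECKS = {
    "boolean": lambda v: isinstance(v, bool),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "keyword": lambda v: isinstance(v, str),
}


def find_mismatches(sample: dict, mapping: dict) -> list:
    """Return the sample field names whose value does not fit the mapped type."""
    mismatches = []
    for field, value in sample.items():
        check = CHECKS.get(mapping.get(field))
        # A field with no mapping, or an unrecognized type, counts as a mismatch.
        if check is None or not check(value):
            mismatches.append(field)
    return mismatches


# memfd values copied from the sample document in the PR description.
SAMPLE_MEMFD = {
    "flag_hugetlb": False,
    "flag_allow_seal": True,
    "flags": 2,
    "name": "test_memfd",
    "flag_exec": False,
    "flag_cloexec": False,
    "flag_noexec_seal": False,
}

# Assumed mapping types: boolean for the flag_* fields, long and keyword
# for flags and name.
ASSUMED_MAPPING = {
    field: "boolean" for field in SAMPLE_MEMFD if field.startswith("flag_")
}
ASSUMED_MAPPING.update({"flags": "long", "name": "keyword"})

if __name__ == "__main__":
    print(find_mismatches(SAMPLE_MEMFD, ASSUMED_MAPPING))
```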

@fearful-symmetry fearful-symmetry marked this pull request as ready for review November 22, 2024 17:17
@fearful-symmetry fearful-symmetry requested review from a team as code owners November 22, 2024 17:17
@fearful-symmetry
Contributor Author

@pzl I think we need a code owner review?

@pzl pzl self-requested a review November 22, 2024 17:43
@pzl
Member

pzl commented Nov 22, 2024

Needs a review from someone on elastic-endpoint, who own the custom_documentation directory

@stanek-michal
Contributor

> @pzl thanks for your help. So, as far as I'm aware these aren't ECS fields, so it sounds like the changes also need to go in custom_schemas and the name goes in custom_subsets.

The memfd object/flags aren't in ECS, but a stage 0 RFC proposal that adds memfd_create as a new event type has been merged (still early, so it might be revised when aligning with OTEL). Linking it for reference:
elastic/ecs#2322
So my understanding is that we can add anything we want to process.Ext to test and validate what we get from eBPF in the most granular way, but the data subset we will actually expose in ECS (also for other integrations, not just Defend) is TBD.

Contributor

@stanek-michal stanek-michal left a comment


Duplicate entry in one doc file, otherwise LGTM!

@fearful-symmetry fearful-symmetry merged commit 0d22496 into main Nov 25, 2024
4 checks passed
@pzl
Member

pzl commented Dec 12, 2024

@fearful-symmetry FYI this is likely going to have to be backed out. A prerelease package version just went out, and kibana CI is failing with:

ERROR FormattedAxiosError: Request failed with status code 400: {"statusCode":400,"error":"Bad Request","message":"Error installing endpoint 8.18.0-prerelease.0: mapper_parsing_exception\n\tCaused by:\n\t\tmapper_parsing_exception: No handler for type [bool] declared on field [flag_hugetlb]\n\tRoot causes:\n\t\tmapper_parsing_exception: No handler for type [bool] declared on field [flag_hugetlb]"}

The recent merge of #555 into main triggered a prerelease build that included the changes from this PR.


Looking at the error, I think there is a simple fix here.

It should be type: boolean on your new fields instead of type: bool.

Absolutely fascinating that this wasn't caught in the elastic-package sample.json-based tests.
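
A lint along these lines would flag the typo before a package build. This is only a sketch under stated assumptions: KNOWN_TYPES is a deliberately non-exhaustive subset of Elasticsearch field types, find_unknown_types is a hypothetical helper, and the field definitions are written as a plain dict rather than read from the repo's schema files.

```python
# Non-exhaustive subset of Elasticsearch field types, for illustration.
KNOWN_TYPES = {
    "boolean", "keyword", "text", "long", "integer", "short", "byte",
    "double", "float", "date", "ip", "object", "nested",
}


def find_unknown_types(fields: dict) -> list:
    """Return (field name, type) pairs whose type is not in KNOWN_TYPES."""
    return [
        (name, definition.get("type"))
        for name, definition in fields.items()
        if definition.get("type") not in KNOWN_TYPES
    ]


if __name__ == "__main__":
    # Mirrors the failing case from the error above: bool instead of boolean.
    fields = {
        "flag_hugetlb": {"type": "bool"},
        "flag_allow_seal": {"type": "boolean"},
    }
    for name, bad_type in find_unknown_types(fields):
        print(f"{name}: unknown mapping type {bad_type!r}")
```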

@pzl pzl mentioned this pull request Dec 12, 2024
1 task
@fearful-symmetry
Contributor Author

@pzl ack! Sorry about that! Yeah, weird that CI didn't catch that...

5 participants