twistedtaya.blogg.se - Splunk dedup command

#SPLUNK DEDUP COMMAND CODE#

Avoid using the dedup command on the _raw field if you are searching over a large volume of data. If you search the _raw field, the text of every event in memory is retained, which impacts your search performance. This performance behavior also applies to any field with high cardinality and large size.

Differences between SPL and SPL2: In SPL2, command options must be specified first, and the list of fields must be comma-delimited. The sortby argument is not supported in SPL2. Use the sort command before the dedup command if you want to change the order of the events, which dictates which event is kept when the dedup command is run. Alternative: if you are using the from command, you can specify the ORDER BY clause instead of using the sort command. The keepevents argument is not supported in SPL2.

The output of this query will also go through some additional translation to be used in our audit system, which takes a list of keys, each wrapped in single quotes and comma-delimited. I've used stats delim="','" and mvcombine with some success at this point in the query to get results that finally look like this. I wanted to include this as part of the question so it's clear what the end state needs to look like, in case something needs to change somewhere in the grouping and sorting section to make this easier.

As far as I can see in the search reference for the dedup command, it has no way to make it case-insensitive, so the only solution I see is to change username to lowercase before running dedup on it: eval username=lower(username)
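To make the two ideas above concrete (deduplicating on a lowercased username, then producing a single-quoted, comma-delimited key list), here is a minimal Python sketch of the logic. It is not Splunk code. The function names and the sample events are hypothetical, and it assumes that the first event in the current order is the one to keep, as the text describes for dedup.

```python
def dedup_case_insensitive(events, field):
    """Keep the first event seen for each value of `field`, ignoring case."""
    seen = set()
    kept = []
    for event in events:
        # Normalize the value first, mirroring the lowercase step in the text.
        key = str(event[field]).lower()
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


def quote_join(values):
    """Wrap each value in single quotes and join the results with commas."""
    return ",".join("'{}'".format(v) for v in values)


# Hypothetical sample data, not taken from the original post.
events = [
    {"username": "Alice", "key": "k1"},
    {"username": "alice", "key": "k2"},
    {"username": "BOB", "key": "k3"},
]

unique = dedup_case_insensitive(events, "username")
keys = quote_join(e["key"] for e in unique)
print(keys)
```

Because the first occurrence wins, the order of the events before deduplication decides which key survives for each username.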


The output of this is to be used for certain audit purposes. What I've found is that when I extend the search to multiple days (returning > 10,000 events), the output is erratic, and I see results that are out of order, not the most recent, or otherwise askew.

What I want to see is this: only the top 2 for each client type, sorted by time descending within each group (columns: clientType, key, _time).

For some "what I tried": I've tried using some query code in various orders, mostly revolving around stats list(key), sort 0 -_time and so on, with various "by" clauses.

This function processes field values as strings.

The first function works best when the search includes the sort command immediately before the statistical or charting command.
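The dependence on ordering can be illustrated with a small Python sketch. This models the idea only and is not Splunk's implementation: first_value returns whatever event happens to come first in the current order, while earliest_value (a stand-in for a time-based pick such as the earliest function) looks at the timestamp. The helper names and sample events are hypothetical.

```python
def first_value(events, field):
    """Value from the first event in the current result order."""
    return events[0][field]


def earliest_value(events, field):
    """Value from the event with the smallest _time, whatever the order."""
    return min(events, key=lambda e: e["_time"])[field]


# Hypothetical events, deliberately not in time order.
events = [
    {"_time": 20, "key": "b"},
    {"_time": 30, "key": "c"},
    {"_time": 10, "key": "a"},
]

# Without a sort, "first" simply reflects arrival order.
unsorted_first = first_value(events, "key")

# Sorting by time descending right before taking "first" gives the newest key.
newest_first = sorted(events, key=lambda e: e["_time"], reverse=True)
sorted_first = first_value(newest_first, "key")

# A time-based pick does not depend on the order of the list.
earliest = earliest_value(events, "key")

print(unsorted_first, sorted_first, earliest)
```

The point of the sketch is that an order-based pick is only meaningful when the order has just been fixed by a sort.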

To locate the first value based on time order, use the earliest function instead. You can use this function with the stats, streamstats, and timechart commands.

I've been struggling with this query for a few hours and it seems that it should be fairly straightforward, but for some reason I'm finding it quite difficult. My customer wants to see the top 2 most recent "key" for each clientType. (Really it's the top N - but more than "first" or "last".) Some clientTypes are not very frequent, and we need to see the most recent of those as well.

While some translation is done before the step I'm asking about, I have the data looking like this as the output of the query (note: for proprietary reasons I cannot provide the actual steps above this, nor is this real data, but I think it should translate OK). I've starred the records that should end up in the output. Now, I've been able to get this working on a smaller scale, say 1 day.
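The requirement described here, the N most recent keys per clientType, including infrequent client types, can be sketched in Python as "sort by time descending, then keep the first N events per group". In SPL, one common way to express the same idea is probably a sort 0 -_time followed by dedup 2 clientType, but that is an assumption about the poster's data rather than a confirmed fix. The function name and sample events below are hypothetical.

```python
from collections import defaultdict


def top_n_per_group(events, group_field, n):
    """Return the n most recent events for each value of group_field."""
    # Newest first, so the first n events seen per group are the most recent.
    ordered = sorted(events, key=lambda e: e["_time"], reverse=True)
    counts = defaultdict(int)
    kept = []
    for event in ordered:
        group = event[group_field]
        if counts[group] < n:
            counts[group] += 1
            kept.append(event)
    # Stable sort: groups end up together, time-descending within each group.
    kept.sort(key=lambda e: e[group_field])
    return kept


# Hypothetical sample data: "web" is frequent, "batch" is rare.
events = [
    {"clientType": "web", "key": "w3", "_time": 3},
    {"clientType": "batch", "key": "b1", "_time": 1},
    {"clientType": "web", "key": "w5", "_time": 5},
    {"clientType": "web", "key": "w4", "_time": 4},
]

result = top_n_per_group(events, "clientType", 2)
for row in result:
    print(row["clientType"], row["key"], row["_time"])
```

Sorting the full event set before limiting per group is what keeps rare client types from being dropped, since nothing is truncated before the grouping step.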