## Summary of the Pull Request
_What is this about?_
Due to our GDPR privacy requirements, we decided to completely purge personally identifiable information (PII) from our AppInsights telemetry and logging. Instead of removing every logging statement that contains personal info, I created a filter function that runs telemetry through a recursive scrubbing function before logging it. This PR adds that scrubbing function.
## PR Checklist
* [x] Applies to work item: #660
* [ ] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLA.
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx
## Info on Pull Request
_What does this include?_
Includes changes to `events.py` in `onefuzzlib`. I've implemented a function, `log_event()`, that recursively checks Event structures for `UserInfo` before logging to AppInsights.
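For reviewers who want a feel for the approach, here is a minimal sketch of recursive scrubbing. It is not the code in this PR: the `UserInfo` class below is a hypothetical stand-in, `scrub` is an illustrative name, and it assumes events can be viewed as nested dicts and lists.

```python
from typing import Any


class UserInfo:
    """Hypothetical stand-in for the real UserInfo model (field name assumed)."""

    def __init__(self, upn: str) -> None:
        self.upn = upn


def scrub(value: Any) -> Any:
    """Return a copy of `value` with every UserInfo instance removed.

    Dict entries and list items holding a UserInfo are dropped entirely;
    a bare UserInfo is replaced by None. Tuples come back as lists.
    """
    if isinstance(value, UserInfo):
        return None
    if isinstance(value, dict):
        return {
            key: scrub(item)
            for key, item in value.items()
            if not isinstance(item, UserInfo)
        }
    if isinstance(value, (list, tuple)):
        return [scrub(item) for item in value if not isinstance(item, UserInfo)]
    return value
```

The function builds a new structure rather than mutating its input, so the original event can still be delivered to other consumers with the user info intact.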
## Validation Steps Performed
I ran local tests using a script I created with test events.
_How does someone test & validate?_
I can provide the local testing script. If that is insufficient, I can write a unit test that will run against this code.
If a user manually deletes a scaleset managed by OneFuzz, then `get_vmss_size` returns `None`.
When this happens, `Scaleset.shutdown` raises an exception from the `logging.info` call on line 573.
This PR handles this edge condition.
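As a rough illustration of the guard (not the actual change), the sketch below checks for `None` before the size is used in a log message. The function name `log_shutdown`, the injected `get_size` callable, and the message text are all assumptions made for the example.

```python
import logging
from typing import Callable, Optional


def log_shutdown(name: str, get_size: Callable[[str], Optional[int]]) -> bool:
    """Log shutdown progress; return False if the scaleset is already gone.

    `get_size` stands in for a lookup like get_vmss_size, which can return
    None when the scaleset was deleted outside of OneFuzz.
    """
    size = get_size(name)
    if size is None:
        # Handle the missing scaleset explicitly instead of formatting None.
        logging.info("scaleset shutdown: scaleset already deleted: %s", name)
        return False
    logging.info("scaleset shutdown: %s has %d nodes", name, size)
    return True
```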
If we log in successfully, save the login data immediately. That way, if users run a second command before the first one finishes, they only have to log in once.
Filter coverage recording against human-readable, demangled symbols.
- Add custom demanglers for Itanium C++ mangling, rustc mangling, and MSVC decorated names
- Add a catch-all demangler that tries each known demangler against a raw symbol, in a fixed order
- Default to using the catch-all demangler in coverage recording
We try to implement a lowest common denominator across schemes: omit types and extra annotations, but preserve generic specializations, namespacing, and paths. Note that the omission of parameter types causes collisions in the face of ad hoc polymorphism. Consult the unit tests for examples.
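The catch-all strategy can be sketched as follows. This is a Python illustration only, not the implementation in this PR: the three `try_*` functions are toy stand-ins that recognize a scheme by its usual prefix (`_Z` for Itanium, `_R` for rustc v0, `?` for MSVC) and do no real demangling, and the order shown is an assumption.

```python
from typing import Callable, List, Optional

# A demangler returns the readable name, or None if it does not
# recognize the raw symbol.
Demangler = Callable[[str], Optional[str]]


def try_itanium(raw: str) -> Optional[str]:
    # Toy stand-in: Itanium C++ mangled names start with "_Z".
    return f"itanium:{raw}" if raw.startswith("_Z") else None


def try_rustc(raw: str) -> Optional[str]:
    # Toy stand-in: rustc v0 mangled names start with "_R".
    return f"rustc:{raw}" if raw.startswith("_R") else None


def try_msvc(raw: str) -> Optional[str]:
    # Toy stand-in: MSVC decorated C++ names start with "?".
    return f"msvc:{raw}" if raw.startswith("?") else None


# Fixed order (assumed here), so the result for a given symbol is deterministic.
DEMANGLERS: List[Demangler] = [try_itanium, try_rustc, try_msvc]


def demangle_any(raw: str, demanglers: List[Demangler] = DEMANGLERS) -> Optional[str]:
    """Try each demangler in order and return the first successful result."""
    for demangle in demanglers:
        result = demangle(raw)
        if result is not None:
            return result
    return None
```

Because the list is ordered, a symbol that more than one scheme could accept always resolves to the same demangler.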
Update the filter rule format and implementation to be simpler and more predictable for users. In particular, we remove an accidental dependence of rule application on hash map iteration order.
- Add caching to symbol table-driven module disassembly on Linux.
- Add configurable regex-based filtering for coverage collection, by module and module-scoped symbol name.
Block coverage recording can be manually tested using the `block_coverage` example in the `coverage` crate. See `./block_coverage -h` for expected args.
The filter file is optional. Its format is JSON, as in this example:
```json
{
  "modules": {
    "allow": [
      "<module-path-regex-1>",
      "<module-path-regex-2>"
    ]
  },
  "symbols": {
    "<module-path-regex-1>": {
      "allow": [
        "<symbol-name-regex-1>",
        "<symbol-name-regex-2>"
      ]
    },
    "<module-path-regex-2>": {
      "deny": [
        "<symbol-name-regex-3>",
        "<symbol-name-regex-4>"
      ]
    }
  }
}
```
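The exact rule semantics are not spelled out here, so the following Python sketch is only one plausible reading of the format, with made-up regexes. It assumes that an `allow` list admits only matching names, a `deny` list rejects matching names, anything without a rule is allowed, and that the first matching module regex (in file order) decides which symbol rule applies.

```python
import json
import re
from typing import Any, Dict, List

# Example filter with illustrative regexes (not taken from this PR).
FILTER_JSON = """
{
  "modules": {"allow": ["^/usr/lib/.*", ".*/fuzz_target$"]},
  "symbols": {
    ".*/fuzz_target$": {"allow": ["^parse_.*"]},
    "^/usr/lib/.*": {"deny": ["^__.*"]}
  }
}
"""


def matches_any(patterns: List[str], text: str) -> bool:
    return any(re.search(pattern, text) for pattern in patterns)


def apply_rule(rule: Dict[str, List[str]], text: str) -> bool:
    # Assumed semantics: "allow" admits only matches, "deny" rejects matches.
    if "allow" in rule:
        return matches_any(rule["allow"], text)
    if "deny" in rule:
        return not matches_any(rule["deny"], text)
    return True


def module_allowed(rules: Dict[str, Any], module: str) -> bool:
    return apply_rule(rules.get("modules", {}), module)


def symbol_allowed(rules: Dict[str, Any], module: str, symbol: str) -> bool:
    # json.loads keeps keys in file order, so the first matching
    # module regex wins deterministically.
    for module_regex, rule in rules.get("symbols", {}).items():
        if re.search(module_regex, module):
            return apply_rule(rule, symbol)
    return True


RULES = json.loads(FILTER_JSON)
```

With these rules, `symbol_allowed(RULES, "/home/u/fuzz_target", "parse_header")` is true, while `main` in the same module and `__libc_start` in `/usr/lib/libc.so` are both filtered out.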
Closes #285.
In debugging the connection retry issues, I dug into this more.
Apparently, some of `hyper`'s connection errors are not mapped to `std::io::Error`, which renders the existing downcast implementation ineffective.
As such, this PR makes the following updates:
1. Any request that fails for what `reqwest` calls a `connection` error is considered transient.
2. Updates the retry notify code to use our `warn` macro so that the events show up in Application Insights.
3. Updates the unit test to demonstrate such failures by trying to connect to `http://localhost:81/`, which shouldn't be listening on any system.
4. Adds a simple unit test to verify that connections to https://www.microsoft.com work with `send_retry_default`.
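
The retry policy in items 1 and 2 can be illustrated with a small Python analogue. This is not the Rust code in this PR and does not use `reqwest`: `send_with_retry`, its parameters, and the use of Python's built-in `ConnectionError` as the "transient" class are all assumptions made for the sketch.

```python
import logging
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def send_with_retry(
    send: Callable[[], T],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
    notify: Optional[Callable[[Exception, int], None]] = None,
) -> T:
    """Call `send`, retrying only on connection-level failures.

    ConnectionError (and its subclasses) is treated as transient; any
    other exception propagates immediately without a retry.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ConnectionError as err:
            if attempt == max_attempts:
                raise
            if notify is not None:
                notify(err, attempt)
            else:
                # Warn on every retry so the failure is visible in telemetry.
                logging.warning("transient failure (attempt %d): %s", attempt, err)
            time.sleep(delay_seconds)
    raise ValueError("max_attempts must be at least 1")
```

A production version would normally add exponential backoff between attempts; a fixed delay keeps the sketch short.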
Fixes #263