In a previous commit, reimage_queued was added to prevent reimaging a node while it is reimaging. However, this means reimaging failures due to Azure issues don't finish reimaging.
This will reset the this flag allowing the node to reimage in the following cleanup cycle.
We're experiencing a bug where Unions of sub-models are getting downcast, which causes a loss of information.
As an example, EventScalesetCreated was getting downcast to EventScalesetDeleted. I have not figured out why, nor can I replicate it locally to minimize the bug send upstream, but I was able to reliably replicate it on the service.
While working through this issue, I noticed that deserialization of SignalR events was frequently wrong, leaving things like tasks as "init" in `status top`.
Both of these issues are Unions of models with a type field, so it's likely these are related.
this fixes an issue related to object id reuse that can occur making the
object identification cache fail. Instead, this simplifies the
hide_secrets to always recurse and use setattr to always set the value
based on the recursion.
Note, the object id reuse issue was seen in the
`events.filter_event_recurse` development and this was the fix for the
id reuse there.
Python documentation states:
id(object):
Return the “identity” of an object. This is an integer (or long integer)
which is guaranteed to be unique and constant for this object during its
lifetime. Two objects with non-overlapping lifetimes may have the same
id() value.
## Summary of the Pull Request
_What is this about?_
We'd like to refactor the proxy lifecycle to only delete when the proxy is out-of-date - i.e. when the proxy is older than 7 days or a mismatched version. I've changed two files, proxy.py and timer_daily\init.py to check for the version and timestamp before stopping a live proxy.
## PR Checklist
* [ ] Applies to work item: #xxx
* [ ] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLI.
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx
## Info on Pull Request
_What does this include?_
Changes to two files:
proxy.py:
- get_or_create() edited to check if timestamp is >7 days.
- Created is_outdated() to check version and timestamp for out-of-date proxy.
timer_daily/init.py
- Proxy check now includes is_outdated() before determining if a proxy should be shutdown.
## Validation Steps Performed
Deploying test instance to determine if proxy lives past a single day.
Expose the Azure storage table's "Timestamp" for the models where Timestamp should be user-accessible and remove the Timestamp field from models that did not sign up for it.
The behavior where Timestamp is only set by Azure Storage is kept.
In practice, Application Insights can take up to 3 minutes before something sent to it is available via KQL.
This PR logs a start and stop marker such that the integration tests only search for logs during the integration tests. This reduces the complexity when using the integration tests during the development process.
Note: this migrated the new functionality from #356 into the latest integration test tools.
## Summary of the Pull Request
_What is this about?_
## PR Checklist
* [x] Applies to work item: #562
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLI.
* [x] Tests added/passed
* [ ] Requires documentation to be updated
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx
## Info on Pull Request
The end-to-end changes needed to have onefuzz deployed with multi-tenant authentication.
## Validation Steps Performed
_How does someone test & validate?_
## Summary of the Pull Request
_What is this about?_
Due to our GDPR privacy requirements, we decided that it would be best to completely purge personal identifiable information from our AppInsights telemetry and logging. Instead of just removing all of the logging statements with personal info, I created a filter function that logs telemetry after it's been run through a recursive scrubbing function. This PR includes this new scrubbing function.
## PR Checklist
* [x] Applies to work item: #660
* [ ] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/onefuzz) and sign the CLI.
* [ ] Tests added/passed
* [ ] Requires documentation to be updated
* [x] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx
## Info on Pull Request
_What does this include?_
Includes changes to events.py in onefuzzlib. I've implemented functionality - log_event() - to recursively check Event structures for UserInfo before logging to AppInsights.
## Validation Steps Performed
I run local tests using a script I created with test events.
_How does someone test & validate?_
I can provide local testing script. If that is insufficient, I can write a unit test that will run against this code.
If a user manually deletes a scaleset managed by OneFuzz, then `get_vmss_size` returns None.
When this happens, `Scaleset.shutdown` generates an exception from the `logging.info` call on line 573.
This PR handles this edge condition.