Commit Graph

1882 Commits

Author SHA1 Message Date
78f8e3215f Starting migration of QueueNodeHearbeat (#1742)
* Add support for etag and timestamp
Introducing EntityBase
Starting migration of QueueNodeHearbeat

* rename namespaces

* upgrade Microsoft.Azure.Functions.Worker to 1.6.0
Added support when name contains underscore to the case converter

* Support for not renaming enum fields

* bug fixes

* Arm client created in the constructor
added null check
2022-04-05 20:08:46 +00:00
0c3d9fcad2 Add CLI command to download agent logs (#1723)
* It does some things

* Download logs from job config

* Lint

* Make mypy happy

* Update to handle the new logs path

* progress

* A job might not have logs set in config

* Mypy wanted a type annotation
2022-04-05 15:35:15 -04:00
be27d430cd Bump structopt from 0.3.25 to 0.3.26 in /src/agent (#1617)
Bumps [structopt](https://github.com/TeXitoi/structopt) from 0.3.25 to 0.3.26.
- [Release notes](https://github.com/TeXitoi/structopt/releases)
- [Changelog](https://github.com/TeXitoi/structopt/blob/master/CHANGELOG.md)
- [Commits](https://github.com/TeXitoi/structopt/compare/v0.3.25...v0.3.26)

---
updated-dependencies:
- dependency-name: structopt
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marc Greisen <mgreisen@microsoft.com>
2022-04-04 15:12:26 -07:00
26911c82db Setting up the ORM (#1738)
* setting up orm

* Add unit tests

* Added support for object serialization
added tests

* moving stuff around

* Fix handling of enum flags cases
removing dependency to CaseExtension
Adding more tests

* add support for property rename

* caching EntityInfo

* remove access to environment variable

* formatting

* renaming dbName to columnName
2022-04-04 16:58:52 +00:00
fbff3fc4af Logging (#1737)
* Logging

* Doing dependency injection

* expose GetLoggers for better testing

Co-authored-by: stas <statis@microsoft.com>
2022-04-02 15:36:08 -07:00
f11e79de4b Starting work... - add http client - add package lock - set namespace to Microsoft.OneFuzz.Service - add static class to get environment variables (#1736)
Co-authored-by: stas <statis@microsoft.com>
2022-04-01 13:20:08 -07:00
9cbbf86e36 Add the setup state to the unavailable list (#1731)
* Add the setup state to the unavailable list

* Add the init state to the unavailable list

* Fix integration test

* Wrong import
2022-04-01 14:00:25 -04:00
20d3df0a11 Deploy dotnet Azure function alongside Python Azure function (#1733)
Co-authored-by: stas <statis@microsoft.com>
2022-04-01 09:42:06 -07:00
e2d554da13 ApiService solution in C# (#1734) 2022-04-01 01:14:12 +00:00
2ffeadfa35 switch to bicep template only and bicep refactor (#1732)
* switch to bicep template only and bicep refactor

* correct monitorAccount name

Co-authored-by: stas <statis@microsoft.com>
2022-03-31 13:01:02 -07:00
dc354cffe3 port arm template to bicep (#1724)
* port template to bicep

* Update src/deployment/azuredeploy.bicep

Co-authored-by: Teo Voinea <58236992+tevoinea@users.noreply.github.com>

* port template to bicep

* adding type annotation

* apply changes from #1679

Co-authored-by: stas <statis@microsoft.com>
Co-authored-by: Teo Voinea <58236992+tevoinea@users.noreply.github.com>
2022-03-31 08:18:44 -07:00
a2e87c6158 Upload logs of the agent (#1721)
* Setting the service side of the log management
- a log is created or reused when we create a job
- when scheduling the task we send the log location to the agent
The expected log structure looks like
{fuzzContainer}/logs/{job_id}/{task_id}/{machine_id}/1.log
2022-03-30 15:20:42 -07:00
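
The log layout described in this commit can be sketched as a small path builder. This is only an illustration under stated assumptions: the function name `agent_log_path` and the `index` parameter are hypothetical, and the only detail taken from the commit message is the `{fuzzContainer}/logs/{job_id}/{task_id}/{machine_id}/1.log` structure.

```python
from uuid import UUID


def agent_log_path(
    fuzz_container: str,
    job_id: UUID,
    task_id: UUID,
    machine_id: UUID,
    index: int = 1,  # hypothetical: the commit message only shows "1.log"
) -> str:
    """Build a log location following the structure in the commit message."""
    parts = [
        fuzz_container,
        "logs",
        str(job_id),
        str(task_id),
        str(machine_id),
        f"{index}.log",
    ]
    return "/".join(parts)
```

Keeping the job, task, and machine identifiers as separate path segments means every log for a job shares one prefix, which is what a prefix-based listing or download would rely on.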
7add51fd3a Log redirection, service side (#1727)
* Setting the service side of the log management
- a log is created or reused when we create a job
- when scheduling the task we send the log location to the agent
The expected log structure looks like
{fuzzContainer}/logs/{job_id}/{task_id}/{machine_id}/1.log

* regenerate docs

* including job_id in the container name

* regenerating docs
removing bad doc file
2022-03-29 18:47:20 +00:00
424ffdb4b5 Adding auto scale via cli (#1717)
* Initial implementation for adding auto scale via cli

* Remove unused argument

* Remove unused import

* I had a 👻extra line👻
2022-03-29 09:50:07 -04:00
5e31ba5b18 Consolidating Log Analytics References & Definitions (#1679)
* Consolidating Log Analytics References & Definitions.

* Updating variable name.

* Adding vm insights var name.

* removing bad files.

* Bad file.

* Fixing var.

* Adding new variables for all resources names.

* Removing autoscale changes.

Co-authored-by: Hayley Call <Hayley.Call@microsoft.com>
2022-03-28 13:34:38 -07:00
24454e3681 pin click to fix black (#1726)
* pin click to fix black

* missed a couple

Co-authored-by: stas <statis@microsoft.com>
2022-03-28 12:06:13 -07:00
387d5446ab Bump num_cpus from 1.13.0 to 1.13.1 in /src/agent (#1548)
Bumps [num_cpus](https://github.com/seanmonstar/num_cpus) from 1.13.0 to 1.13.1.
- [Release notes](https://github.com/seanmonstar/num_cpus/releases)
- [Changelog](https://github.com/seanmonstar/num_cpus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/num_cpus/compare/v1.13.0...v1.13.1)

---
updated-dependencies:
- dependency-name: num_cpus
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marc Greisen <mgreisen@microsoft.com>
2022-03-28 10:00:11 -07:00
ce03394376 Handle instance being destroyed before updating scaling protection (#1719)
* Handle instance being destroyed before updating scaling protection

* Fix bug where we release protection too early
2022-03-28 10:14:39 -04:00
5c418eeb36 Add autoscaling diagnostics (#1708)
* Initial attempt

* Adding diagnostics works

* 🧹

* lint

* I wish the linter could auto fix these issues

* Lint
2022-03-21 13:25:21 +00:00
9d3b0e0f92 Add Linux dynamic library checks using ldd/LD_DEBUG (#1718) 2022-03-17 17:26:31 -07:00
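
As a rough illustration of the `ldd` half of this check, the sketch below runs `ldd` on a binary and collects the libraries reported as `not found`. It is not the agent's implementation (the agent is written in Rust); the function names are hypothetical, and the `LD_DEBUG` path is not covered here.

```python
import re
import subprocess
from typing import List

# ldd prints unresolved dependencies as "<name> => not found".
_NOT_FOUND = re.compile(r"^\s*(\S+)\s+=>\s+not found\s*$")


def parse_missing_libraries(ldd_output: str) -> List[str]:
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        match = _NOT_FOUND.match(line)
        if match:
            missing.append(match.group(1))
    return missing


def missing_libraries(binary_path: str) -> List[str]:
    """Run ldd on a binary (Linux only) and list its unresolved libraries."""
    result = subprocess.run(
        ["ldd", binary_path],
        capture_output=True,
        text=True,
        check=False,  # ldd exits non-zero for some inputs; parse what we get
    )
    return parse_missing_libraries(result.stdout)
```

Separating the parser from the subprocess call keeps the parsing logic testable on any platform.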
9000a234e9 Add crate to debug missing dynamic libraries on Windows (#1713)
Add a CLI tool and library code to debug missing dynamic library errors on Windows.

The implementation manually edits the registry global flags for an image file to temporarily enable loader snaps, runs the target under our custom debugger to collect the debug output strings, then parses them for informative loading errors. It does not depend on the presence of `gflags.exe`.

This detects both dynamic linking (and thus process startup) errors, as well as dynamic loading (`LoadLibrary`) errors. It can report multiple missing dynamically-linked libraries.
2022-03-16 16:11:26 -07:00
194e7d0e9e Move the event grid topic creation and subscription to the deployment template (#1591)
* move the event grid subscription to the template

* change the name of the new subscription to prevent deleting the wrong subscription

* refactoring

* mypy fix

* format

* format

* remove old event grid before arm deployment

* fix deploy

* attempt to fix check-pr issue

* fix interactive login in check-pr

* move the event grid subscription to the template

* change the name of the new subscription to prevent deleting the wrong subscription

* refactoring

* mypy fix

* format

* format

* remove old event grid before arm deployment

* using resource Id

* fix type

* fix location

* revert changes in registration.py

* build fix attempt

* build fix

* revert ci changes

* remove file

* address comment

* address PR comments

* naming

* fix deployment
2022-03-15 10:48:42 -07:00
ce36de72e7 Generate debuginfo for windows-libfuzzer-load-library test target (#1684) 2022-03-11 09:40:51 -08:00
c1ffec32f4 Reduce log level in ASAN parsing (#1705) 2022-03-10 14:03:58 -08:00
d4c92d497f Add type definition (#1703) 2022-03-09 23:52:06 +00:00
40b0e6685a Give function app resource group scoped contributor role (#1698)
* Give function app resource group scoped contributor role

* Reenable autoscaling

* We don't know what the minimum capacity for a sku is yet

* Lint
2022-03-09 13:07:22 -05:00
fa8589a3d6 Bump backoff from 0.3.0 to 0.4.0 in /src/agent, resolve dependabot block (#1589)
* Initial changes needed for backoff 0.4 to work

* Update backoff versions, fix BackoffError:Transient fields, other uses

* Format

* Removed redundant field name

* Improved backoff update changes

* Update backoff update

* Revert

* Changed to using Error::transient function
2022-03-08 16:07:07 -08:00
b22955de9b Upgrade regex crate (#1699)
* Bump to non vulnerable regex version

* Removed unnecessary cargo.toml change
2022-03-08 19:17:29 +00:00
4d1c1f5713 Abstract out node disposal (#1686)
* Abstract node disposal strategy

* Cleanup + lint

* Handle possible scalesets being in resize state

* Setting the size is still exposed via CLI, we don't want to break that functionality

* PR comments
2022-03-08 13:30:34 -05:00
7c507ab7c7 Remove dependency on onefuzz deployment role to unblock (#1693) 2022-03-04 18:51:57 +00:00
e2bd878f59 Release 5.1.0 (#1681)
* Release 5.1.0

* Update CHANGELOG.md

Co-authored-by: Joe Ranweiler <joe@lemma.co>

* Update CHANGELOG.md

Co-authored-by: Joe Ranweiler <joe@lemma.co>

* Update CHANGELOG.md

Co-authored-by: Joe Ranweiler <joe@lemma.co>

* Prevent deletion of the repro VM on failure for debugging.

Co-authored-by: Joe Ranweiler <joe@lemma.co>
2022-03-03 15:55:36 -08:00
c9f7dd51f7 Clippy fix fell through the cracks (#1690)
* Clippy and fmt

* Compare versions

* Move the version block up

* and_then is the function we're looking for here
2022-03-03 16:37:04 -05:00
cde7602553 Fixes clippy lint (#1687)
* Fix lint

* Cargo fmt

* more lints
2022-03-03 11:50:39 -05:00
50bac07d19 increase the polling period in functional tests (#1682)
* increase the polling period

* update timeout to 30s

* format
2022-03-01 18:10:02 -08:00
d260689233 Remove use of deprecated warn() method on logger object (#1641)
Remove use of deprecated `warn()` method on logger object.

Co-authored-by: Marc Greisen <marc@greisen.org>
2022-03-01 09:19:11 -08:00
0a6b5898bc Add timeout to setup scripts (#1659)
Add a setup script-specific timeout of 59 minutes. This is just shorter than the service-side `NODE_EXPIRATION_TIME` which otherwise garbage collects nodes whose setup scripts are stuck or taking too long.

With this change, the high-level cause of the timeout is clear, instead of the closest error being something indirect, like "node reimaged during task execution".
2022-02-28 23:00:39 -08:00
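
The idea of a setup-script-specific timeout can be sketched as follows. This is a minimal illustration, not the agent's code: `run_setup_script` and `SetupScriptTimeout` are hypothetical names, and only the 59-minute value comes from the commit message.

```python
import subprocess
from datetime import timedelta
from typing import Sequence

# From the commit message: just shorter than the service-side
# NODE_EXPIRATION_TIME, so the agent reports the timeout itself.
SETUP_SCRIPT_TIMEOUT = timedelta(minutes=59)


class SetupScriptTimeout(Exception):
    """Raised when a setup script exceeds its time budget (hypothetical)."""


def run_setup_script(
    argv: Sequence[str],
    timeout: timedelta = SETUP_SCRIPT_TIMEOUT,
) -> subprocess.CompletedProcess:
    """Run a setup script, failing with an explicit error on timeout."""
    try:
        return subprocess.run(
            list(argv),
            capture_output=True,
            text=True,
            timeout=timeout.total_seconds(),
        )
    except subprocess.TimeoutExpired as err:
        # Surface the direct cause instead of a later, indirect failure.
        raise SetupScriptTimeout(
            f"setup script timed out after {timeout}"
        ) from err
```

Raising a dedicated error at the point of the timeout is what makes the high-level cause visible, rather than leaving the node to be reimaged mid-task.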
1b019818b5 Fail fast if managed task workers are near-OOM (#1657)
- Add `onefuzz::memory::available_bytes()` to enable checking system-wide memory usage
- In managed task worker runs, heuristically check for imminent OOM conditions and try to exit early
2022-02-28 21:36:52 -08:00
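
A heuristic near-OOM check of this kind might look like the sketch below. It assumes a Linux host and reads `MemAvailable` from `/proc/meminfo`; the threshold value and the Python function names are assumptions for illustration, and the real `onefuzz::memory::available_bytes()` is Rust code whose implementation is not shown in this log.

```python
def parse_mem_available_bytes(meminfo: str) -> int:
    """Extract MemAvailable from /proc/meminfo text, converted to bytes."""
    for line in meminfo.splitlines():
        if line.startswith("MemAvailable:"):
            # Format: "MemAvailable:   123456 kB" (the unit is KiB).
            kib = int(line.split()[1])
            return kib * 1024
    raise ValueError("MemAvailable not found in meminfo")


def available_bytes(path: str = "/proc/meminfo") -> int:
    """Read system-wide available memory in bytes (Linux only)."""
    with open(path, encoding="ascii") as handle:
        return parse_mem_available_bytes(handle.read())


# Hypothetical threshold; the commit message does not state the real one.
MIN_AVAILABLE_BYTES = 100 * 1024 * 1024


def is_near_oom(available: int, threshold: int = MIN_AVAILABLE_BYTES) -> bool:
    """Heuristic: treat the system as near-OOM below the threshold."""
    return available < threshold
```

A worker loop could poll `is_near_oom(available_bytes())` periodically and exit early with a clear error when it returns true.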
f918299df0 Force -runs=1 when invoking in repro mode (#1651)
Co-authored-by: Marc Greisen <marc@greisen.org>
2022-02-28 12:23:18 -08:00
16166e1c14 Create autoscale resources for scaleset (#1661)
* Initial progress on adding an auto scale resource

* auto scale API is ready

* When creating a scaleset, add an autoscale resource to it as well

* Auto scale is correctly linked with scaleset

* 🧹

* Lint

* Cleaned up
2022-02-28 17:28:31 +00:00
40efa9ae1b Unescape blob name in BlobUrl (#1673)
* Unescape blob name in BlobUrl
2022-02-23 14:31:27 -08:00
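
To illustrate why unescaping matters, the sketch below splits a blob URL into container and blob name and percent-decodes the name. It assumes the common `https://<account>.blob.core.windows.net/<container>/<blob>` layout; the function name is hypothetical and this is not the `BlobUrl` type from the agent.

```python
from typing import Tuple
from urllib.parse import unquote, urlsplit


def container_and_blob_name(url: str) -> Tuple[str, str]:
    """Split a blob URL into (container, unescaped blob name)."""
    # urlsplit keeps the query string (e.g. a SAS token) out of the path.
    path = urlsplit(url).path.lstrip("/")
    container, _, escaped_name = path.partition("/")
    if not container or not escaped_name:
        raise ValueError(f"not a blob URL: {url}")
    # The URL path is percent-encoded; the blob name itself is not.
    return container, unquote(escaped_name)
```

Without the `unquote` step, a blob named `crash one.txt` would be reported as `crash%20one.txt`, which does not match the name stored in the container.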
674444b7d7 Split integration tests into different steps (#1650)
Refactoring check-pr.py to extract the logic of downloading the binaries
refactoring integration-test.py to split the logic of setup, launch, check_result and cleanup
2022-02-22 22:33:00 +00:00
5d8516bd70 Enable scale in protection on VMSS instances (#1647)
* draft attempt at adding scaling protection

* Service can now control scaling protection policy on VM instances

* Improve logging a bit

* draft attempt at adding scaling protection

* Service can now control scaling protection policy on VM instances

* Improve logging a bit

* Error message was missing info

* Linter

* Don't schedule work if we can't protect the node

* Last of the linter changes
2022-02-14 14:56:55 +00:00
77dcd57b46 Add EventGrid compatible webhook format (#1640) 2022-02-11 16:39:19 -08:00
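
For context, an Event Grid schema payload is a JSON array of events with fields such as `id`, `subject`, `eventType`, `eventTime`, `data`, and `dataVersion`. The sketch below wraps an arbitrary payload in that envelope; the helper name, the example event type, and the mapping of OneFuzz event fields onto the envelope are all assumptions, since the commit message does not describe them.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List


def to_event_grid_envelope(
    event_type: str,
    subject: str,
    data: Dict[str, Any],
    data_version: str = "1.0",  # hypothetical version string
) -> List[Dict[str, Any]]:
    """Wrap a payload in an Event Grid schema envelope (a one-item list)."""
    return [
        {
            "id": str(uuid.uuid4()),
            "subject": subject,
            "eventType": event_type,
            "eventTime": datetime.now(timezone.utc).isoformat(),
            "data": data,
            "dataVersion": data_version,
        }
    ]


def to_json(events: List[Dict[str, Any]]) -> str:
    """Serialize the envelope list as the webhook request body."""
    return json.dumps(events)
```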
65fd48b31b Fix typo (#1625) 2022-02-10 13:15:48 -05:00
b7fc35ffea Update yanked block-buffer from 0.10.0 to 0.10.1 (#1648)
* Update yanked block-buffer from 0.10.0 to 0.10.1

* update yanked crossbeam-utils from 0.8.5 to 0.8.7

* update yanked crossbeam-utils from 0.8.5 to 0.8.7

Co-authored-by: stas <statis@microsoft.com>
2022-02-08 08:02:39 -08:00
62731f3836 bump azure-cli-core and azure-cli to 2.32.0 (#1634) 2022-02-02 17:38:46 -08:00
ee4dfe922f Allow authority to be specified in check-pr (#1473)
* fix interactive login in check-pr

* specify tenant domain in check-pr

* bug fix

* rename tenant to authority
2022-02-02 16:06:42 -08:00
809db31186 fix node serialization (#1627)
* fix node serialization

* remove intermediate variable

* mypy fix
2022-01-31 18:05:28 +00:00
dab0dba34f Fix lint warning (#1631) 2022-01-31 15:59:20 +00:00
4e6f496e06 bump azure-mgmt-subscription to 3.0.0 (#1606) 2022-01-28 18:57:12 -08:00