Azure DevOps – Job-Focused vs Task-Focused

This is the first post in the series, Azure DevOps Odds and Ends. The starting page has information about the code environment I’m working with, as well as links to the other posts in the series.

The first step in consolidating builds into a single pipeline was to gather code from four repositories into a single location. From there, the projects could be built and deployed as a set. Modeling the new pipeline on our existing ones, I created jobs: first a job to check out the code, then jobs to build and publish the solutions. Below is the start of the “BuildDACPACs” stage of the pipeline.

- stage: BuildDACPACs
  jobs:
  - job: BuildDACPACs
    displayName: Build and publish database DACPACs
    workspace:
      clean: all
    steps:
    # First step: check out the repositories with the source code and other scripts the pipeline needs.
    - checkout: git://AutomatedTestingPoC/AutomatedTestingPoC #Optional but included to gain access to artifacts in the "root" repo, such as PowerShell scripts needed in the deployment process
    - checkout: git://AutomatedTestingPoC/Staging
    - checkout: git://AutomatedTestingPoC/SystemLogging
    - checkout: git://AutomatedTestingPoC/Warehouse
    - checkout: git://AutomatedTestingPoC/AzureDataFactory
  - job: BuildWarehouseDB
    displayName: Build Warehouse database project
    steps:
    # Second step: Build and publish the Warehouse database project 
    #   Two things:
    #     1) the DACPAC is buried, so it is being copied to a pre-defined 'ArtifactStagingDirectory' from the bin folder
    #     2) There are 2 settings files for publishing; one for Automated Testing and another for other environments.  Those need to be copied into the artifacts folder, along with the DACPAC.
    - task: VSBuild@1
      displayName: Build Warehouse Database
      inputs:
       solution: "$(Build.SourcesDirectory)/Warehouse/**/*.sln"
       platform: $(buildPlatform)
       configuration: $(buildConfiguration)
       clean: true
    - powershell: Copy-Item (Get-ChildItem "$($env:Build_SourcesDirectory)\Warehouse" -Filter *.dacpac -Recurse).FullName $env:Build_ArtifactStagingDirectory
      displayName: Copy Warehouse DACPAC
    - powershell: Copy-Item (Get-ChildItem "$($env:Build_SourcesDirectory)\Warehouse" -Filter Default.publish.xml -Recurse).FullName "$($env:Build_ArtifactStagingDirectory)\Warehouse.Publish.xml"
      displayName: Copy Warehouse Publish Settings XML
    - task: PublishBuildArtifacts@1
      displayName: Publish Warehouse Database
      inputs:
        PathtoPublish: '$(Build.ArtifactStagingDirectory)'
        ArtifactName: 'BlueMonkey_artifacts'
        publishLocation: 'Container'

The YAML syntax was valid and the pipeline started to run. The checkouts worked as expected, leaving each repository’s code in its own subfolder. The Warehouse database build was the first to start, and it failed on the build step itself. In Image #1, the error states the solution couldn’t be found at the path ‘D:\a\1\s\Warehouse\**\*.sln’. Not being the most observant character, I was stumped and didn’t understand what was happening. After some digging, I realized the step before the build was doing exactly what it said it was doing. Imagine that!! I had checked out the code in the first job, but another checkout was happening at the start of the build job. Looking through the YAML above, though, there isn’t an explicit step that checks out the code a second time. So where was it being called from?

Image #1 – Pipeline logs; Warehouse database failed

It turns out that when a job runs, ADO performs a number of initialization steps, including setting up an agent. This happened for the first job, which checked out the code, and it happened again for the job that builds the database. One of those automated steps is to check out the repository in which the YAML file is defined. To avoid wiping out the code gathered in the first job, we can’t use a separate job for each major process in building and deploying the solution. The workspace: clean option near the top of the YAML above provided a clue, which led me to dig deeper into what was happening and to review the logs in more detail.

Starting: Initialize job
Agent name: 'Hosted Agent'
Agent machine name: 'WIN-6C2TM2SQ98G'
Current agent version: '2.211.0'
Operating System
Runner Image
Runner Image Provisioner
Current image version: '20221002.2'
Agent running as: 'VssAdministrator'
Prepare build directory.
Set build variables.
Download all required tasks.
Downloading task: VSBuild (1.208.0)
Downloading task: PowerShell (2.210.0)
Downloading task: PublishBuildArtifacts (1.200.0)
Checking job knob settings.
   Knob: AgentToolsDirectory = C:\hostedtoolcache\windows Source: ${AGENT_TOOLSDIRECTORY} 
   Knob: AgentPerflog = C:\agents\perflog Source: ${VSTS_AGENT_PERFLOG} 
Finished checking job knob settings.
Start tracking orphan processes.
Finishing: Initialize job
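
The initialization shown above runs for every job: each job is assigned its own agent and a fresh workspace, and unless the job lists an explicit checkout step, ADO injects an implicit checkout of the repository hosting the YAML file. Below is a minimal sketch of that behavior (the job names are made up for illustration); checkout: none is the documented way to suppress the implicit checkout.

jobs:
- job: JobA
  steps:
  # With no checkout step listed, ADO injects an implicit 'checkout: self',
  # cloning the repository that hosts this YAML into $(Build.SourcesDirectory).
  - powershell: Get-ChildItem $env:Build_SourcesDirectory
    displayName: Show the implicitly checked-out code
- job: JobB
  steps:
  # 'checkout: none' suppresses the implicit checkout. Nothing from JobA's
  # workspace is visible here either; JobB initializes its own agent and
  # workspace, exactly as in the log above.
  - checkout: none
  - powershell: Get-ChildItem $env:Build_SourcesDirectory
    displayName: Show the (now empty) sources directory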

Instead, we need to structure the YAML to run many tasks within a single job. The build and publish tasks are now contained within one job, BuildArtifacts. When the job is initialized, the current repository is still checked out, but the first steps explicitly check out all of the repositories, placing each one into its own subfolder under $(Build.SourcesDirectory).

- stage: BuildArtifacts
  jobs:
  - job: BuildArtifacts
    displayName: Build and publish database DACPACs and ADF code
    workspace:
      clean: all
    steps:
    # First step: check out the repositories with the source code and other scripts the pipeline needs.
    - checkout: git://AutomatedTestingPoC/AutomatedTestingPoC #Optional but included to gain access to artifacts in the "root" repo, such as PowerShell scripts needed in the deployment process
    - checkout: git://AutomatedTestingPoC/Staging
    - checkout: git://AutomatedTestingPoC/SystemLogging
    - checkout: git://AutomatedTestingPoC/Warehouse
    - checkout: git://AutomatedTestingPoC/AzureDataFactory
    # Second step: Build and publish the Warehouse database project 
    #   Two things:
    #     1) the DACPAC is buried, so it is being copied to a pre-defined 'ArtifactStagingDirectory' from the bin folder
    #     2) There are 2 settings files for publishing; one for Automated Testing and another for other environments.  Those need to be copied into the artifacts folder, along with the DACPAC.
    - task: VSBuild@1
      displayName: Build Warehouse Database
      inputs:
       solution: "$(Build.SourcesDirectory)/Warehouse/**/*.sln"
       platform: $(buildPlatform)
       configuration: $(buildConfiguration)
       clean: true
    - powershell: Copy-Item (Get-ChildItem "$($env:Build_SourcesDirectory)\Warehouse" -Filter *.dacpac -Recurse).FullName $env:Build_ArtifactStagingDirectory
      displayName: Copy Warehouse DACPAC
    - powershell: Copy-Item (Get-ChildItem "$($env:Build_SourcesDirectory)\Warehouse" -Filter Default.publish.xml -Recurse).FullName "$($env:Build_ArtifactStagingDirectory)\Warehouse.Publish.xml"
      displayName: Copy Warehouse Publish Settings XML
    - task: PublishBuildArtifacts@1
      displayName: Publish Warehouse Database
      inputs:
        PathtoPublish: '$(Build.ArtifactStagingDirectory)'
        ArtifactName: 'BlueMonkey_artifacts'
        publishLocation: 'Container'

With each repository’s code stored in its own subfolder, as originally intended, the build task finds the code and the build process proceeds.

Image #2 – Pipeline logs; Warehouse build succeeded
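
If there’s ever doubt about what the multi-repo checkout produced, a throwaway diagnostic step can list the subfolders under the sources directory (this is just a hypothetical sanity check, not part of the final pipeline):

    - powershell: Get-ChildItem $env:Build_SourcesDirectory -Directory | Select-Object Name
      displayName: List checked-out repository folders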

This was my first major hurdle, and it made clear that there is a lot going on behind each YAML instruction. That hidden machinery greatly simplifies implementing pipelines, but it also means more care has to be taken to ensure the pipeline behaves as you expect.
