Cloud Approach
I contributed to the security architecture and security assurance of an HMG Java microservices project. The approach followed Government Digital Service (GDS) guidelines and was being used to replace major HMG systems.
I worked in conjunction with an enterprise transformation architecture team in a central government project team to promote the adoption of cloud technology through advocacy, education and design support, consulting on best practices for cloud security architecture, design, operations and service orchestration for a continuous delivery solution to the project.
I worked with a cross-functional team of architects to investigate the multi-jurisdictional, multi-country aspects of cloud utilisation and data storage in conjunction with the requirements of network segmentation for RIPA. The objective was to use a public cloud type of service so as to benefit from the on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service characteristics of a transition to cloud, in support of the organizational business and mission goals for which the application was to be written.
Considerations were in line with NIST SP 800-53, the US public sector catalogue of security and privacy controls, for security requirements, control implementation and management.
Secure coding methodologies were in place with the intention of achieving compliance with GDPR, NSC20 and ISO 27001, and with a view to future PCI DSS compliance.
Accreditation and risk management for public cloud deployment
IS1 Risk assessment methodology
Risk assessments were initially undertaken using the HMG IS1 risk assessment framework (which sits within IS1-2, the HMG enhanced version of the ISO 27001 standard). This was used to draft an enterprise risk approach but was not a cultural fit for the programme or the department, so more informal methods were adopted which were broadly compatible with the NIST SP 800-30 risk assessment framework.

Cloud-based and other risk considerations
The risks of vendor lock-in (the inability to exit a cloud provider), a single point of failure in a cloud provider, attack on the application's virtual machines from other tenants' virtual machines (guest escape), the security of snapshots and the general sprawl of cloud-based data were considered. Risks to the management plane of DoS or interception of control data, control conflict risk and management-software-related risks were also considered. Non-cloud-specific risks were also assessed: natural disaster, unauthorized facility access, social engineering, network attacks on the consumer and on the provider side, default passwords, and other malicious or non-malicious actions. A defence in depth approach was taken in the mitigation policy for all such risks. There was a consideration of the capabilities of the infrastructure to provide 24x7x365 uptime together with providing automated infrastructure management. A combination of physical and logical controls was to be considered, together with the demarcation of risk management control responsibility between the cloud provider and the application owner.
Security was considered at all parts of application life cycle management, with data, functions and processes all broken into zones according to their classification. The impact of the following incidents was considered: the information/data becoming widely public and widely distributed (including crossing geographic boundaries); an employee of the cloud provider accessing the application; a process or function being manipulated by an outsider; processes or functions failing to provide expected results; the information/data being unexpectedly changed; and the application being unavailable for a period of time. This was the basis of the testing undertaken against the application, as well as of the testing of the APIs, in order to see whether the application was 'cloud ready', i.e. ready for public cloud deployment with the data classifications involved. To achieve this, new methods would have had to be used to secure the application for cloud deployment. Trust in the cloud host's administration, as well as the attack vector from other tenants, was considered. Extra monitoring was considered necessary to ensure the integrity of the application and its data under the influence of these risks. The obvious way to ensure cloud readiness would be to build in encryption at field, file and store level, in transit and at rest, together with solid key management and encryption engine deployment. All of these were to be considered in the Software Development Life Cycle (SDLC) stages of business and security requirements, design requirements, development, testing, secure operation and disposal.
Additional risk assessment: threat modelling
An additional approach to risk assessment was threat modelling of the application data at each stage of its processing using the STRIDE model. This covered threats of Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service and Elevation of Privilege. The software assurance would encompass the development and implementation of methods and processes for ensuring that software functions as intended while mitigating the risks of vulnerabilities, malicious code or defects that could bring harm to the end user.
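As a minimal sketch of how such an analysis can be driven systematically (the element names below are hypothetical illustrations, not the project's actual data flows), each data-flow element is enumerated against the STRIDE categories to generate a review backlog:

    import java.util.EnumSet;
    import java.util.List;

    // Minimal STRIDE enumeration sketch; element names are illustrative only.
    public class StrideSketch {

        enum Stride { SPOOFING, TAMPERING, REPUDIATION,
                      INFORMATION_DISCLOSURE, DENIAL_OF_SERVICE,
                      ELEVATION_OF_PRIVILEGE }

        record DataFlowElement(String name, EnumSet<Stride> applicableThreats) {}

        public static void main(String[] args) {
            List<DataFlowElement> model = List.of(
                new DataFlowElement("REST API gateway", EnumSet.allOf(Stride.class)),
                new DataFlowElement("message queue",
                    EnumSet.of(Stride.TAMPERING, Stride.INFORMATION_DISCLOSURE,
                               Stride.DENIAL_OF_SERVICE)),
                new DataFlowElement("citizen record store",
                    EnumSet.of(Stride.TAMPERING, Stride.INFORMATION_DISCLOSURE,
                               Stride.REPUDIATION)));

            // Emit one review line per element/threat pair for the workshop backlog.
            model.forEach(e -> e.applicableThreats().forEach(t ->
                System.out.printf("Review %s against %s%n", e.name(), t)));
        }
    }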
Estimation of jurisdictional and compliance requirements
An approach was considered whereby the application was broken down to separate the functions and services that have legal jurisdictional implications from those that do not, in order to ascertain the overall security posture of the application.
The use of ISO 27034-1 was considered in order to set out the Organization Normative Framework (ONF) for the business, regulatory and technical contexts and specifications, together with the roles, responsibilities and qualifications of the actors and the Application Security Control library needed to meet the target level of trust.
Stakeholder management
Consideration was given to the identification and involvement of relevant stakeholders and to ensuring that the risk assessment process in use was in line with the corporate business risk assessment process. This was quite difficult in practice because the risk appetites differed between groups of stakeholders.
Communication channels were in operation for corporate stakeholders involved in compliance, audit, risk, legal, finance, operations and data protection/privacy, with executive committee/director involvement, as well as consideration of the needs of the Information Commissioner (the UK PII regulator).
Code, data and infrastructure security testing
Library / Dependency checking
There was concern about the assurance of the code libraries in use. I arranged for library code dependency checking using the Black Duck utility.
Code In-Sprint Checking
Verification and validation of coding at each stage of the development process was required through functional and security testing and code checks.
I arranged static application security testing (SAST) and dynamic application security testing (DAST) at each stage of the SDLC.
Application Security testing
The application was tested against the OWASP Top 10 for the following vulnerability groups:
1. Injection (untrusted data sent to an interpreter as part of a command or query, e.g. SQL injection; see the sketch after this list),
2. Broken Authentication and Session Management (often not implemented correctly),
3. Cross-site Scripting (XSS): (application takes untrusted data and sends it to a web browser without proper validation or escaping),
4. Insecure Direct Object References (when a developer exposes a reference to an internal implementation object without an access control check or other protection),
5. Security Misconfiguration (not having a secure configuration defined and deployed for the application, frameworks, application server, web server, database server, and platform),
6. Sensitive Data Exposure (not properly protecting sensitive data, such as credit cards, tax IDs, and authentication credentials),
7. Missing Function Level Access Control (applications not performing access control checks on the server when each function is accessed),
8. Cross-site Request Forgery (CSRF): (an attack that forces a logged-on victim’s browser to send a forged HTTP request, including the victim’s session cookie and any other automatically included authentication information, to a vulnerable web application),
9. Using Components with Known Vulnerabilities,
10. Unvalidated Redirects and Forwards (redirecting and forwarding users to other pages and websites, using untrusted data to determine the destination pages).
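By way of a hedged illustration (the class, table and column names below are hypothetical, not the project's code), items 1 and 3 are typically mitigated in Java with parameterised queries and contextual output encoding:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Sketch only: table/column names are illustrative.
    public class SafeQueryExample {

        // Item 1 (Injection): bind user input as a parameter instead of
        // concatenating it into the SQL string.
        public String findUserEmail(Connection conn, String userId) throws SQLException {
            String sql = "SELECT email FROM users WHERE user_id = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, userId);          // the driver handles escaping
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("email") : null;
                }
            }
        }

        // Item 3 (XSS): escape untrusted data before writing it into HTML.
        public static String escapeHtml(String untrusted) {
            StringBuilder sb = new StringBuilder(untrusted.length());
            for (char c : untrusted.toCharArray()) {
                switch (c) {
                    case '<'  -> sb.append("&lt;");
                    case '>'  -> sb.append("&gt;");
                    case '&'  -> sb.append("&amp;");
                    case '"'  -> sb.append("&quot;");
                    case '\'' -> sb.append("&#x27;");
                    default   -> sb.append(c);
                }
            }
            return sb.toString();
        }
    }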
Formal Software and Infrastructure testing and remediation
I scoped staged penetration tests under HMG CHECK rules, together with final code security inspection using VCG, at key deployment stages. Vulnerabilities so identified were then mitigated by building Jira mitigation tasks into workstack sprints. Remediation was completed before code was deployed.
Data security
Data Security Technologies
The basis of this is to apply data security technologies, e.g. encryption, data leakage prevention (DLP), audit, and file and database access monitoring (FIM and DAM), to prevent unauthorized data exfiltration and to detect unauthorized access to data stored in files and databases. Consideration was also given to the use of obfuscation, anonymization, tokenization and the masking of data.
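As a minimal sketch of two of these techniques (assuming an in-memory token vault purely for illustration; a production system would use a hardened, audited store), tokenization swaps a sensitive value for a random surrogate with no mathematical relationship to the original, while masking hides most of the value for display:

    import java.security.SecureRandom;
    import java.util.Base64;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative only: a real token vault would be a hardened, audited service.
    public class TokenizationSketch {

        private final Map<String, String> vault = new ConcurrentHashMap<>();
        private final SecureRandom random = new SecureRandom();

        // Replace a sensitive value with a random surrogate; only the vault
        // can map the surrogate back to the original value.
        public String tokenize(String sensitive) {
            byte[] bytes = new byte[16];
            random.nextBytes(bytes);
            String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
            vault.put(token, sensitive);
            return token;
        }

        public String detokenize(String token) {
            return vault.get(token);
        }

        // Masking for display: keep the last four characters only.
        public static String mask(String value) {
            if (value == null || value.length() <= 4) return "****";
            return "*".repeat(value.length() - 4) + value.substring(value.length() - 4);
        }
    }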
Encryption
Various kinds of key management architecture were considered, with and without hardware security modules (HSM: FIPS 140-2 compliant key management), as was the question of where encryption should take place, bearing in mind the replication possibilities of data in public cloud.
The effect of encryption on the indexing of data was considered early in the design of the architecture, as well as its effect on the collection of metadata and on performance (i.e. whether separate encryption infrastructure could be provided), through consideration of the data security lifecycle and the possible use of homomorphic processing of encrypted data, that is, processing without prior decryption. The nirvana of application encryption (at field level) was not built into the application being developed: this could have been a key enabler of the use of public cloud, provided the risks of doing so could have been managed. There are many sorts of encryption which make the use of public cloud a business possibility for many applications: proxy level encryption, database level encryption, storage level encryption, application level encryption, OS volume level encryption and file level encryption. Properly implemented in the application development life cycle, these could enable very cost-effective, flexible, elastic use of public cloud resources, provided key management and key escrow are handled correctly. Due regard would be needed to the design of the infrastructure for the data, the encryption engine, and the key management infrastructure and process. Another technology under consideration was bit-splitting, where data is stored across multiple clouds in much the same way as RAID storage splits data for hardware resilience.
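As a minimal sketch of field-level application encryption (assuming AES-GCM via the standard Java Cryptography Architecture; key management and rotation are deliberately out of scope and the inline key generation is for demonstration only), the sensitive field is encrypted before the record ever leaves the application:

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;
    import java.nio.charset.StandardCharsets;
    import java.security.SecureRandom;
    import java.util.Base64;

    // Field-level encryption sketch using AES-GCM from the standard JCA.
    // In production the key would come from an HSM or key management service.
    public class FieldEncryptionSketch {

        private static final int GCM_TAG_BITS = 128;
        private static final int IV_BYTES = 12;

        public static String encryptField(SecretKey key, String plaintext) throws Exception {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);              // fresh IV per field
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(GCM_TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];  // prepend IV for storage
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return Base64.getEncoder().encodeToString(out);
        }

        public static String decryptField(SecretKey key, String encoded) throws Exception {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                        new GCMParameterSpec(GCM_TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        }

        public static void main(String[] args) throws Exception {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(256);
            SecretKey key = kg.generateKey();              // demo only, never inline
            String token = encryptField(key, "NI number: QQ123456C");
            System.out.println(decryptField(key, token));
        }
    }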
Data Classification and PII
The inputs to this consideration were data type (format, structure); jurisdiction (of origin, domiciled) and other legal constraints; context; ownership; contractual or business constraints; trust levels and source of origin; value, sensitivity and criticality (to the organization or to a third party); and the obligation for retention and preservation of data. This was used to identify the Personally Identifiable Information (PII) in the datasets so that, in conjunction with an understanding of the jurisdictional considerations of applicable law (e.g. the EU Privacy Directive and the General Data Protection Regulation), the contractual and regulatory requirements of the use and storage of PII could be understood and delivered in the design. This analysis could then be used as a driver to understand where the roles of controller and processor sit in the application and its hosting.
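As an illustration of how such inputs can be made machine-readable (the field names and tags below are hypothetical, not the project's schema), each field in a dataset can carry a classification, a PII flag and a jurisdiction, from which the controller/processor and hosting analysis follows:

    import java.util.List;

    // Illustrative data-classification tagging; field names are hypothetical.
    public class ClassificationSketch {

        enum Classification { PUBLIC, OFFICIAL, OFFICIAL_SENSITIVE }

        record FieldTag(String fieldName, Classification classification,
                        boolean pii, String jurisdiction) {}

        public static void main(String[] args) {
            List<FieldTag> schema = List.of(
                new FieldTag("caseReference", Classification.OFFICIAL, false, "UK"),
                new FieldTag("applicantName", Classification.OFFICIAL_SENSITIVE, true, "UK/EU"),
                new FieldTag("dateOfBirth", Classification.OFFICIAL_SENSITIVE, true, "UK/EU"));

            // PII fields drive the controller/processor and hosting analysis.
            schema.stream().filter(FieldTag::pii)
                  .forEach(f -> System.out.println("PII field requiring GDPR handling: "
                                                   + f.fieldName()));
        }
    }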
Data retention policy
A data retention policy was considered in regard to regulatory requirements, data mapping, data classification, data retention procedure, and the monitoring and maintenance of the policy.
DEV Security
Application lifecycle management tools
Maven builds were in use with architectural styles of web services, SOA, REST APIs and microservices; the code versioning tool in use was GitHub.
Application lifecycle management tools in use were Jira, Confluence and Jenkins.
Development frameworks in use
Agile frameworks and practices: Scrum, user stories, DevOps, Continuous Integration & Delivery, Pair Programming
Security awareness training
I provided security awareness training for the whole business process management and development team.
Security Architecture and Security Procedures
I consulted on security architecture for new additions and connections for the project. Procedures were documented in Confluence and projects initiated and continuously monitored in Jira.
Data Centre Security
Data centre performance
As stated above, the physical location of the data centre affects the jurisdictional treatment of PII handling. Consideration of the type of cloud service in use (PaaS, IaaS or SaaS) determines the decisions to be made in regard to operation of the application in its data centre environment.
The requirements of automating service enablement, consolidating monitoring capabilities, and optimising the Mean Time to Repair (MTTR) and the Mean Time Between Failures (MTBF) in the hosting service provision were to be considered.
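As an illustrative calculation (the figures are hypothetical), steady-state availability can be estimated as MTBF / (MTBF + MTTR): an MTBF of 10,000 hours with an MTTR of 2 hours gives 10,000 / 10,002, roughly 99.98% availability, so driving MTTR down matters as much as driving MTBF up.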
Physical design
The physical design should cover clean water, clean power, food, telecommunications and accessibility during/after a disaster, together with physical security design features that limit access to authorized personnel. Examples include perimeter protections such as walls, fences, gates and electronic surveillance, and access control points to control ingress and egress and to verify identity and access authorization with an audit trail; this includes egress monitoring to prevent theft.
Physical and Virtual device infrastructure
The right balance between virtual and physical networking control has to be reached to suit the isolation requirements of private and public clouds for the application to be hosted. Logical network separation is typically delivered with 802.1Q VLAN tagging, which allows many virtual networks to share one physical router port. Consideration took place as to whether this allowed sufficient separation in the infrastructure for the application to be hosted.
VMware in combination with Docker containerisation
IaaS architecture was required to support the VMware NSX platform in conjunction with Docker, where VMware was used to close a critical hole in the Docker architecture to enable the secure functioning of the Docker daemon.
Data Centre Evaluation
On the management side we needed to be able to assess patch management, performance monitoring, log capture, SIEM, virtual and physical networking, backup and secure management of virtual machine images, and the scheduling and orchestration of the data centre, all of which did not appear to be sufficiently transparent for our needs in the IaaS architectures available for the application.
Forensics
In addition, it was not clear how forensics would run in the cloud environments under consideration. This involved a consideration of the collection, acquisition and preservation of digital evidence. Roles and responsibilities for forensics activity would have to be agreed between host and client/user.
Architecture
There was an attempt to obtain GDPR readiness in regard to a single set of rules, one-stop shop, responsibility and accountability, consent, a data protection officer, pseudonymisation, data breach handling, sanctions considerations, the right to erasure, data portability, and privacy by design and by default. There was a consideration of XML gateways at API interfaces to deliver DLP, antivirus and anti-malware services as required.
A sandboxing approach was taken to the zoning of the application with database containing sensitive records zoned away from the rest of the application.
We considered the TCI Reference Architecture tools and methodology to enable the leveraging of a common set of solutions to assess where departmental IT and potential cloud providers stood in terms of security capabilities, and to attempt to produce a roadmap to meet the security needs of the business while preventing vendor lock-in. Consideration was given to various vertical specialist cloud providers, including the incumbent government-accredited cloud provider upon which the IL3 dev system was being hosted.
An OpenStack hybrid cloud, combining a private OpenStack cloud with public clouds (AWS or Azure), was also considered.
For enterprise-level IAM, Hortonworks Apache Ranger was in use to deliver open source access controls across the application through policy enforcement points and policy decision points, with integration to the SAML federation standard across the whole enterprise application.
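As a hedged sketch of the enforcement pattern (the interfaces below are hypothetical illustrations, not the Apache Ranger API), a policy enforcement point intercepts each request and defers the allow/deny decision to a policy decision point:

    import java.util.Set;

    // Hypothetical PEP/PDP interfaces illustrating the enforcement pattern;
    // these are not the Apache Ranger API.
    public class PepPdpSketch {

        record AccessRequest(String user, Set<String> roles,
                             String resource, String action) {}

        interface PolicyDecisionPoint {
            boolean isPermitted(AccessRequest request);
        }

        // A trivial role-based PDP standing in for an external policy service.
        static class RolePdp implements PolicyDecisionPoint {
            @Override
            public boolean isPermitted(AccessRequest r) {
                return "read".equals(r.action()) && r.roles().contains("caseworker");
            }
        }

        // The PEP: every call path funnels through here before touching data.
        static void enforce(PolicyDecisionPoint pdp, AccessRequest request) {
            if (!pdp.isPermitted(request)) {
                throw new SecurityException("Access denied: " + request);
            }
        }

        public static void main(String[] args) {
            enforce(new RolePdp(), new AccessRequest(
                "alice", Set.of("caseworker"), "case/123", "read"));
            System.out.println("Request permitted");
        }
    }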
OPNsense open source next-generation firewalling was built into several zones of the application, and web application firewalls (WAF) were designed into the front end of the hosting architecture.
SIEM monitoring and forensics
On the SIEM front, in the Docker microservices environment, Kubernetes logging with Elasticsearch and Kibana was developed for the system. QRadar, ArcSight and Splunk had been considered for this but were not suited to the scaling and elastic architecture which was projected to be in use.
The SIEM platform provided alerting, dashboards, compliance, retention and forensic analysis in a system of continuous suspect-activity detection rules. Forensics activity was to be supported by a chain of custody process covering the collection, possession, condition, location, transfer, access to and any analysis of events of interest, to provide evidential quality of data with non-repudiation.
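As a minimal sketch of how such events can reach the SIEM (the event fields are illustrative, not the project's schema), emitting audit events as structured JSON lines on stdout lets the container log pipeline ship them to Elasticsearch without any application-side transport:

    import java.time.Instant;

    // Illustrative structured audit logging: one JSON object per line on stdout,
    // to be collected by the container log pipeline and indexed in Elasticsearch.
    public class AuditEventSketch {

        static String jsonEscape(String s) {
            return s.replace("\\", "\\\\").replace("\"", "\\\"");
        }

        static void emit(String user, String action, String resource, String outcome) {
            String line = String.format(
                "{\"timestamp\":\"%s\",\"user\":\"%s\",\"action\":\"%s\","
                + "\"resource\":\"%s\",\"outcome\":\"%s\"}",
                Instant.now(), jsonEscape(user), jsonEscape(action),
                jsonEscape(resource), jsonEscape(outcome));
            System.out.println(line);   // stdout is the contract with the log shipper
        }

        public static void main(String[] args) {
            emit("alice", "read", "case/123", "PERMITTED");
            emit("mallory", "export", "case/123", "DENIED");
        }
    }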
BCDR
Consideration was given to the options of 1) on-premise with cloud as BCDR, 2) cloud consumer, primary provider BCDR, or 3) cloud consumer, alternative provider BCDR for the BCDR design, with the BCDR strategy then focussing on the capability of restoration of service or failover. The benefits of cloud BCDR are that cloud infrastructure providers have resilient infrastructure; an external BCDR provider has the potential to be very experienced and capable, as its technical and people resources are shared across a number of tenants; and pay-per-use cloud can mean that the total BCDR strategy is a lot cheaper. These requirements were built into the NFRs (Non-Functional Requirements) together with the plan and plan-testing requirements.

Data centre security management
On the management front we were to investigate incident, problem, release and configuration management operations to understand how availability, capacity, continuity, information security, service improvement, incident, problem, release, deployment and configuration management worked in practice. We found that there was insufficient transparency from the suppliers (in so far as considerations took place) in the cloud providers investigated. In addition, there was at that time no AWS region established in the UK for which SOC reports were available.
Conclusion on the use of public cloud for the project
Public Cloud data centres
The key element of the client's senior management desire was to use public cloud with its multi-tenant networks. In a nutshell, the data centre networks are logically divided into smaller, isolated networks sharing the same physical networking gear but operating on their own logical networks without visibility into the other logical networks.
This architecture was to be delivered using an approved hypervisor architecture managed in the management plane of the data centre or data centres, together with approved communications access, secure storage backup and disaster recovery.
The data centre management was to cover segregation of duties, design for monitoring of network traffic, automation and the use of APIs, enforceable logical design decisions, and an access management system that can be audited. Consideration was given to software-defined networking (SDN) tools to support logical isolation. Compute nodes, management plane, storage nodes, control plane and network were to be approved by the client. In a public cloud situation, achieving this requires the entire data centre to be viewed as an integrated combination designed to run at the highest possible efficiency level, which requires custom-designed sub-components to ensure they contribute to the overall efficiency goal. However, HMG audit of all of this was problematic at that time.
Physical data centre requirements
The tier levels of data centres, as defined by the Telecommunications Industry Association in the US, were considered. Only Tier 4 (Fault Tolerant Site Infrastructure) was considered satisfactory, as only this was considered reliable as a single-site, 365-day configuration. This should be the default state of a cloud supplier data centre.
Public Cloud security conclusions
Security consultancy was undertaken on the use of cloud logical architectures, together with a consideration of cloud hosting using Amazon Web Services (AWS) and Microsoft Azure services.
These offerings were found to be inappropriate at that time: neither of them could offer UK-based data hosting services to manage the HMG PII requirement, and both had the other shortcomings in the area of data centre audit documented above. In addition, the application development process and technology in use did not allow for sufficient cloud application controls to be in place to provide for secure use of public cloud hosting.
So, in summary, I:
Led the accreditation process for internally managed systems, providing the programme with a conduit to Accreditors
Chaired Security Working Groups for Systems Programmes
Liaised with the Technical Design Authorities across Programmes, and CESG to ensure compliance with (or support exemptions from) departmental Security Architecture Principles
Contributed to the development of non-functional security requirements for Programme Teams
Managed Programme level Security Testing strategy including external CHECK ITHCs, automated tools and internal penetration testing teams.
Provided security briefings for new Programmes
Reviewed Technical Proposals from suppliers to ensure adherence to HMG security requirements and risk management within programme risk parameters
Provided advice and guidance on security architecture principles and standards to Programme Colleagues, Suppliers and Managers (i.e. Technical Design Authority)
Analysed security technology industry and market trends, and determined their potential impact on the enterprise (and opportunities for utilisation, as documented above).