Next week, I’ll be giving a talk, “To MSSP or Not to MSSP: Some SOC Questions,” at Educause Security Professionals. I’ve never met anyone who has said they love their MSSP. My team and I have been through several POCs with MSSPs, and have used several SOCs with varying results. I don’t think there is a silver bullet to solve all of your SOC needs, in part because organizations vary in how they operate. So instead, we’ve put together a list of questions to consider before getting started.
Why do you need a SOC?
I’m a big fan of Simon Sinek’s Start With Why, so this is usually the first question I ask myself about anything. But it’s a serious question here. What are you trying to accomplish? As I said, I’ve never met anyone who loves their MSSP, so if that’s your expectation going in, make sure you ask the right questions and understand what the SOC can and can’t do. It’s entirely possible that you don’t need a SOC, or that you could do it internally. Those options have to be on the table before you begin. You might not be ready to start your SOC journey yet, and you might get more value out of focusing on different things right now.
- Do you need 24×7 monitoring of all “configured” alerts?
- Note I said “configured” alerts, not all alerts. Big difference.
- Does your program have the necessary prerequisites?
- Focus on basics first. A SOC can only provide monitoring of the data you send to them.
- Do you need to free up your team to work on other things?
- You might only be “freeing” them up to manage your MSSP. There will be a learning process, and your team will be chasing down false positives for a while.
- Can you get SaaS visibility?
- If 60% of your environment uses SaaS and you have no visibility into cloud activity, your SOC might not provide value.
Insource or Outsource?
It’s possible to build a SOC with entry-level talent so long as you have the capacity for training and understand that these individuals will want to move up or out in a year or two. Many universities I’ve talked to choose to use student workers to provide their tier 1 monitoring. Retaining talent is a huge challenge, so offering a career path could provide a talent pool for your organization. But you’ll need to make the case for headcount, and to build a 24×7 SOC you’ll need a large number of folks to staff it.
- Do you have the resources to recruit, manage, and train entry-level SOC workers?
- This might take away from your other priorities.
- Can you make a case for a budget increase?
- Do you have data or reports that can demonstrate the need?
- Are you ready to have an external team pushing you to mature?
- This was a definite side effect of getting a SOC for us…we found a lot of areas where we needed to mature, and this accelerated our learning curve.
- What level of technical prowess are you able to bring to bear on SOC challenges?
- Parsing and ingest capabilities may vary depending on the design.
SIEM or not to SIEM?
- Do you use theirs or use yours?
- If you take away one thing from reading this post, it’s this point. If you do a co-managed SIEM, then if/when you ultimately decide to switch vendors, all of the maturity you’ve built up in alerts and correlation rules stays with you. The change can be as easy as giving accounts to the new company and shutting off the old accounts.
- Using your MSSP’s SIEM isn’t a bad thing, however. Using theirs will mean that they can provide instant value, and the “secret sauce” that they’ve developed for all their other customers can help protect you. If you switch, you lose this and will be starting over from scratch, since their “secret sauce” is proprietary. Also, switching might mean having to change lots of log sources and redirect them to another service, which could be weeks of effort.
- Where will your data be kept?
- Do you need it to stay on your premises? If cloud, can you ensure it stays local to your country?
- What happens if you switch vendors?
- You need to have a plan for offboarding.
How many logs is enough?
- Does the cost per log create a disincentive to use all logs?
- Short answer – yes.
- Chatty log sources are often the most valuable. Can you tune those sources to send only security-related logs?
- Tuning down logs will consume a lot of man hours. Microsoft and others also frequently change how their systems send logs, so this isn’t something you’ll be done with.
- Can Security and Infrastructure use the same logging systems?
- This will mean an order of magnitude increase in logs, but there can be good synergy here. It will also have a big effect on the MSSP you choose.
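To make the tuning question a bit more concrete, here is a minimal sketch of filtering a chatty source down to an allowlist of event IDs and measuring how much volume that removes. The event IDs are real Windows Security log IDs, but the allowlist itself, the `event_id` field name, and the sample events are assumptions for illustration; your own list will depend on your use cases.

```python
# Hypothetical allowlist. The IDs are real Windows Security event IDs, but
# which ones count as "security related" for you is an assumption.
SECURITY_EVENT_IDS = {
    1102,  # audit log cleared
    4624,  # successful logon
    4625,  # failed logon
    4688,  # new process created
    4720,  # user account created
}


def keep_security_events(events, allowlist=SECURITY_EVENT_IDS):
    """Return only the events whose event_id is in the allowlist."""
    return [e for e in events if e.get("event_id") in allowlist]


def reduction_ratio(before, after):
    """Fraction of events dropped by tuning (0.0 means nothing was dropped)."""
    if before == 0:
        return 0.0
    return 1 - after / before


# Made-up sample events, not real data.
sample = [
    {"event_id": 4624, "host": "dc01"},
    {"event_id": 5156, "host": "dc01"},   # filtering platform connection, chatty
    {"event_id": 4625, "host": "dc01"},
    {"event_id": 5156, "host": "web01"},
]
kept = keep_security_events(sample)
print(len(kept), reduction_ratio(len(sample), len(kept)))
```

Running the same measurement against a day of real logs gives you a number to bring to the per-log pricing conversation. It also has to be re-run whenever a source changes its log format.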
Can you have a head-to-head bakeoff?
We’ve done this, and we’ve gotten wildly different results from each vendor. They all will have strengths, so it’s a matter of understanding in advance what’s important to you. See above, “Start With Why”.
- What are your misuse cases? Has the vendor solved them for other customers?
- Specific use cases will help compare apples to apples.
- How will they keep up with new technology or attacks?
- Is their tech stack flexible enough to stay modern?
- How do they retain talent?
- This can be a real differentiator between vendors. Do they have a career path?
- Is SOC their focus?
- There are a LOT of MSSPs out there, but their companies may specialize in pen testing, or they’re a good VAR, or maybe they provide good architecture consulting. These may or may not mean they’ll be good at SOC.
- Do they have in-house resources beyond tier 1?
- Is this something you need or want? What happens if you need incident response or forensics support?
Playbooks
This is where the “secret sauce” really comes into play. How mature is their operation? Can they provide insight to you in advance on what their playbooks look like? Maybe they won’t share this at all before you become a customer because it’s really secret…or maybe it’s just not very mature.
- How will they correlate events to find real incidents?
- How will they hand off incidents to you and who figures out if it’s a false positive?
- How will the MSSP store and share knowledgebase information so they can provide consistent and reliable incident handling, which improves over time?
- How do you negotiate, both up front, and down the road, how much validation is within scope for the MSSP?
Onboarding/Offboarding
Will they have a dedicated account manager to streamline this process? Will you have a dedicated project manager to make sure everything is configured properly? Depending on the architecture choices you made above, this could be an easy process or it could be insanely complicated.
- How will you handle ticketing or case management?
- Yours or theirs? If theirs, how do you hand off to other teams in your organization?
- Will ticketing link directly to view logs or into other investigation tools?
- This could save a huge amount of time while you’re investigating incidents.
- Does your architecture allow you to aggregate logs before forwarding?
- This can make offboarding and onboarding new vendors much simpler.
- Who will manage the process from your team internally?
- Do they have the right skills and relationships inside IT to make this happen smoothly?
- Will they have a dedicated project manager or SOC manager for your account?
- How many additional resources will be required to put everything in place during onboarding (taps, collectors and forwarders, agents on systems, APIs and tokens, etc.)?
- I’ve found that this is really hard to document up front and most vendors see this as being too down in the weeds. But this is one of the easiest places to get tripped up.
Blind spots
I think the point of the SOC should be to eliminate blind spots, so it’s important to make sure you aren’t introducing new ones along the way!
- Do you have all the pieces working that the vendor expects to get the job done? (Endpoint, AV, Network, System logs, App logs, Cloud logs, etc.)
- Will parsing issues create blind spots?
- Short answer – yes.
- Will the MSSP let you know if a server stops sending logs?
- Maybe the server was decommissioned? Or maybe an attacker was clever enough to stop logging?
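The “server stops sending logs” question can be checked mechanically. Below is a minimal sketch, assuming you can get a last-seen timestamp per log source out of your SIEM. The host names, the one-hour threshold, and the decommissioned list are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(last_seen, now, max_silence=timedelta(hours=1),
                        decommissioned=frozenset()):
    """Return hosts that have not logged within max_silence.

    last_seen maps host name -> datetime of the most recent event.
    Hosts known to be decommissioned are skipped, so the remaining
    list is what someone should actually investigate.
    """
    silent = []
    for host, seen_at in last_seen.items():
        if host in decommissioned:
            continue
        if now - seen_at > max_silence:
            silent.append(host)
    return sorted(silent)


# Made-up inventory for illustration.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "web01": now - timedelta(minutes=5),
    "db01": now - timedelta(hours=3),     # quiet for too long
    "old-app": now - timedelta(days=30),  # retired, expected to be silent
}
print(find_silent_sources(last_seen, now, decommissioned={"old-app"}))
```

The design choice that matters is the decommissioned list: without an inventory feed to populate it, every retired server looks the same as one an attacker has silenced.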
Reporting
I’m a CISO, so executive dashboards and reports are probably more important to me than to others, but this can be one area where you’ll start to see big differences between MSSPs.
- Will the MSSP provide regular meetings?
- Weekly? Monthly? Quarterly?
- Did they catch real world events?
- If not, do they have a process for figuring out how they could have found that issue?
- Do they provide weekly, monthly, or annual reports?
- Written reports help spot trends.
- Are they experts with the SIEM platform’s own reporting, for platform performance, trending, and security KPIs?
- If they’re co-managing your SIEM, they should be able to get this right.
Architecture
- Appliances onsite? Forwarding to the cloud?
- Varies greatly depending on the vendor.
- Where are network taps located? Inside the firewall or outside?
- Inside might have lower traffic (less $$) and fewer false positives, but also less visibility.
- How will you handle install issues, OS patching issues, data retention, disposal issues, firewall rules?
- One vendor we did a POC with used open source and all of their appliances already had several critical vulnerabilities.
- How much data will be kept “live” for correlation? Is that enough?
- Is a week ok? Do you need a month or more? How frequently do you do investigations? We’ve found that we often get false positives because some admins only do certain activities every 6 weeks, so we keep getting alerts.
- Secure tunnels to the cloud?
- Tap traffic deduplication?
- You don’t want to pay twice for keeping the same logs.
- Long term archival for compliance? Hot/warm/cold archives?
- You need to know your regulatory obligations for archiving, but generally speaking you probably want a year.
- Where are the resources available to analyze historical incidents?
- Do you employ storage tiering for performance and cost (live on SSD, archives on spindles)?
- Where in the solution do you need to ensure backups and high availability? Ingest pipeline, backend systems, config backups, log backup?
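On the deduplication point, here is a minimal sketch of dropping exact duplicate log records before they are forwarded and billed. It assumes records arrive as text lines and that a duplicate means a byte-for-byte identical line. Real tap deduplication works on packets and is usually done by the vendor’s hardware or software, so treat this as an illustration of the idea only.

```python
import hashlib


def dedupe_records(records):
    """Yield each distinct record once, preserving first-seen order.

    Only a fixed-size SHA-256 digest is remembered per record, so long
    log lines do not have to be kept in memory to detect repeats.
    """
    seen = set()
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).digest()
        if digest in seen:
            continue
        seen.add(digest)
        yield record


# Made-up log lines; the second tap saw the same event as the first.
lines = [
    "2024-01-01T12:00:00Z fw01 DENY 203.0.113.7 -> 10.0.0.5:22",
    "2024-01-01T12:00:00Z fw01 DENY 203.0.113.7 -> 10.0.0.5:22",
    "2024-01-01T12:00:01Z fw01 ALLOW 10.0.0.8 -> 198.51.100.2:443",
]
unique = list(dedupe_records(lines))
print(len(lines), len(unique))
```

In practice the `seen` set would need to be bounded, for example by clearing it per time window, or it will grow without limit.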
Classification of Assets
- How do you know if an event coming in is critical or not?
- Is it a public webserver on your DMZ or your ERP system?
- How do you know if activity on a log source is normal or not?
- Baselines may not be enough… Are you relying on institutional knowledge to resolve false positives?
- Can the solution and the team leverage inventory data, IP subnet data, high value asset data?
- There are some vendors that require you have all of their tools in the tech stack in order to ensure they’ve got this covered.
- Can the solution maintain tagging, risk ratings of internal users and assets, threat ratings of internal and external assets, including threat intelligence feeds?
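One way to picture the asset classification questions is as an enrichment step that tags each event with a criticality label based on its IP subnet. This is a minimal sketch using Python’s standard `ipaddress` module. The subnets, labels, and the `classify_ip` helper are hypothetical; a real deployment would pull this table from an inventory system or CMDB.

```python
import ipaddress

# Hypothetical subnet-to-label table, for illustration only.
ASSET_TIERS = [
    (ipaddress.ip_network("10.0.0.0/8"), "internal"),
    (ipaddress.ip_network("10.10.0.0/24"), "critical"),  # e.g. ERP servers
    (ipaddress.ip_network("192.0.2.0/24"), "dmz"),       # e.g. public web servers
]


def classify_ip(ip, tiers=ASSET_TIERS, default="unknown"):
    """Return the label of the most specific subnet containing ip."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, label) for net, label in tiers if addr in net]
    if not matches:
        return default
    # The longest prefix wins, so 10.10.0.5 is "critical", not "internal".
    return max(matches, key=lambda m: m[0].prefixlen)[1]


for ip in ("10.10.0.5", "10.20.1.1", "192.0.2.10", "198.51.100.1"):
    print(ip, classify_ip(ip))
```

An alert on an address that comes back as `unknown` is itself useful information: it points at a gap in the inventory data the SOC is relying on.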
Value Added Services
- Can they manage firewalls or other equipment?
- Are you comfortable with them proactively blocking things based on logs?
- Can they help introduce automation to reduce workload/increase response?
- Do they have good auditing to help with troubleshooting when automation breaks something?
- Can they use or access other security tools in your environment?
- If you have an inventory system, an audit tool for a database, or a CMDB, can they VPN in to see those?
- Threat Hunting?
- Proactively finding things in your environment is helpful if you don’t have that skillset in-house, but they may need a platform they are familiar with in order to be successful.
- SOC Validation?
- How do you know your SOC is fully instrumented to catch what you expect them to catch? Maybe they offer a scripted attack tool that will run through your use cases to make sure all your tools are logging properly and correlation rules are firing.
Cloud
- Can you get SaaS logs out of the platforms? Might need intermediary cloud storage (e.g. S3 buckets), or supporting cloud applications not yet in place.
- Are those S3 buckets secure?
- Can you package the logs for SIEM processing? Formats tend to be difficult to consume (multiline JSON) and change far more frequently (monthly).
- There is a lot of care and feeding that goes into this and you should understand going in whether this is your responsibility or theirs.
- How does the SIEM vendor use the cloud? They may analyze in the cloud, or send full data or metadata up to the cloud. Where are they located, and how dependent are they on the cloud?
- What happens during an internet outage? Do you lose visibility?
- Do they have CASB integration?
- Really helpful for investigations.
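As an illustration of the “packaging” work mentioned above, here is a minimal sketch that turns a file of pretty-printed (multiline) JSON objects into one compact JSON object per line, a shape many log pipelines find easier to ingest. It assumes the export is a sequence of JSON objects written back to back. The sample records are made up, and real SaaS exports vary by platform.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def to_ndjson(text):
    """Return the same records as newline-delimited, compact JSON."""
    return "\n".join(
        json.dumps(obj, separators=(",", ":")) for obj in iter_json_objects(text)
    )


# Made-up multiline export for illustration.
export = """
{
  "user": "alice",
  "action": "login"
}
{
  "user": "bob",
  "action": "share",
  "targets": [
    "doc1",
    "doc2"
  ]
}
"""
print(to_ndjson(export))
```

A format change on the SaaS side will typically surface as a `json.JSONDecodeError` or as missing fields downstream, which is part of the ongoing care and feeding someone has to own.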